Abstract

The quantum circuit model is the most widely used model of quantum computation. It 
provides both a framework for formulating quantum algorithms and an architecture for the
physical construction of quantum computers. However, several other models of quantum 
computation exist which provide useful alternative frameworks for both discovering new 
quantum algorithms and devising new physical implementations of quantum computers. 
In this thesis, I first present necessary background material for a general physics audience 
and discuss existing models of quantum computation. Then, I present three new results 
relating to various models of quantum computation: a scheme for improving the intrinsic 
fault tolerance of adiabatic quantum computers using quantum error detecting codes, a 
proof that a certain problem of estimating Jones polynomials is complete for the one clean 
qubit complexity class, and a generalization of perturbative gadgets which allows k-body
interactions to be directly simulated using 2-body interactions. Lastly, I discuss general 
principles regarding quantum computation that I learned in the course of my research, and 
using these principles I propose directions for future research. 
Title: Professor of Physics
Chapter 1

Introduction 



1.1 Classical Computation Preliminaries 

This thesis is about quantum algorithms, complexity, and models of quantum computation. 
In order to discuss these topics it is necessary to use notations and concepts from classical 
computer science, which I define in this section. 

The "big-O" family of notations greatly aids in analyzing both classical and quantum
algorithms without getting mired in minor details. Although it may seem like a trivial 
notation, it is the first step in a chain of increasing abstraction which allows computer 
scientists to analyze the general laws of computation which apply whether the computer is 
using base 2 or base 20, and whether it is made of transistors or tinker toys. By following this 
chain we will reach the major open questions about complexity classes and their relations 
to one another. The big-O notation is defined as follows.

Definition 1. Given two functions $f(n)$ and $g(n)$, $f(n)$ is $O(g(n))$ if there exist constants
$n_0$ and $c > 0$ such that $|f(n)| \le c\,|g(n)|$ for all $n > n_0$.

Definition 2. Given two functions $f(n)$ and $g(n)$, $f(n)$ is $\Omega(g(n))$ if there exist constants
$n_0$ and $c > 0$ such that $|f(n)| \ge c\,|g(n)|$ for all $n > n_0$.

Definition 3. Given two functions $f(n)$ and $g(n)$, $f(n)$ is $\Theta(g(n))$ if it is $O(g(n))$ and
$\Omega(g(n))$.

Thus, $O$ describes upper bounds, $\Omega$ describes lower bounds, and $\Theta$ describes asymptotic
behavior modulo an overall multiplicative constant. The standard way to describe the efficiency
of an algorithm is to use "big-O" notation to describe the number of computational
steps the algorithm uses to solve a problem as a function of the number of bits of input. For
example, the standard method for multiplying two n-digit numbers that is taught in elementary
school uses $O(n^2)$ elementary operations in which individual digits are manipulated.
By using big-O notation we can avoid distinguishing between $2n^2$ and $50n^2$, which allows us
to disregard unnecessary details.
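The quadratic scaling of schoolbook multiplication can be checked by counting digit operations directly. The following Python sketch is illustrative only: the function names, and the convention of counting one operation per single-digit multiply-and-add, are assumptions made here rather than anything fixed by the text.

```python
def to_digits(number):
    """Base-10 digits of a non-negative integer, least significant first."""
    return [int(ch) for ch in reversed(str(number))]


def from_digits(digits):
    """Inverse of to_digits."""
    return int("".join(str(d) for d in reversed(digits)))


def schoolbook_multiply(x_digits, y_digits):
    """Multiply two digit lists the way it is done by hand.

    Returns (product_digits, digit_ops), where digit_ops counts the
    single-digit multiply-and-add steps that were performed.
    """
    result = [0] * (len(x_digits) + len(y_digits))
    digit_ops = 0
    for i, xd in enumerate(x_digits):
        carry = 0
        for j, yd in enumerate(y_digits):
            total = result[i + j] + xd * yd + carry
            result[i + j] = total % 10
            carry = total // 10
            digit_ops += 1
        # The slot just past this row has not been written yet.
        result[i + len(y_digits)] += carry
    return result, digit_ops


product_digits, ops = schoolbook_multiply(to_digits(1234), to_digits(5678))
print(from_digits(product_digits), ops)
```

Under this counting convention two n-digit inputs cost exactly $n^2$ digit operations (16 for 4-digit inputs, 64 for 8-digit inputs). A different convention would change the constant but not the $O(n^2)$ scaling, which is the point of the notation.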

Moving one level higher in abstraction, we reach computational complexity classes. 
These are sets of problems solvable with a given set of computational resources. For example, 
the complexity class P is the set of problems solvable on a Turing machine in a number of 
steps which scales polynomially in the number of bits of input n, that is, with $O(n^c)$ steps
for some constant c. Note that for a problem to be contained in P, all problem instances of 
size n must be solvable in time poly(n), including highly atypical worst-case instances. 
Complexity classes are usually defined in terms of decision problems. These are problems
that admit a yes/no answer, such as the problem of determining whether a given integer 
is prime. Many problems are not of this form. For example, the problem of factoring 
integers has an output which is a list of prime factors. However, it turns out that in almost 
all cases, problems can be reduced with polynomial overhead to decision versions. For 
example, consider the problem where, given two numbers a and b, you are asked to answer 
whether a has a prime factor smaller than b. Given a polynomial time algorithm solving this 
problem, one can construct a polynomial time algorithm for factoring using this algorithm 
as a subroutine.^1 Thus by considering only decision problems (or more technically, the
associated languages), complexity theorists are simplifying things without losing anything 
essential. Because problems are usually equivalently hard to their decision versions, we will 
often gloss over the distinction between the problems and their decision versions in this 
thesis. 
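The reduction from factoring to its decision version can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `has_prime_factor_below` is a stand-in oracle implemented by slow trial division, playing the role of the hypothetical polynomial time decision algorithm, and all function names are invented for this example. The binary search makes a number of oracle calls that is linear in the number of bits of a.

```python
def has_prime_factor_below(a, b):
    """Stand-in decision oracle: does a have a prime factor p with p < b?

    Trial division is used purely for illustration; the argument in the
    text assumes some polynomial time algorithm in its place.
    """
    d = 2
    while d < b and d * d <= a:
        if a % d == 0:
            return True
        d += 1
    # No divisor was found: either every d < b was tried (then a >= b),
    # or a is prime and its only prime factor is a itself.
    return 1 < a < b


def smallest_prime_factor(a, oracle):
    """Binary search for the smallest prime factor of a >= 2."""
    lo, hi = 2, a + 1  # invariant: oracle(a, lo) is False, oracle(a, hi) is True
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if oracle(a, mid):
            hi = mid
        else:
            lo = mid
    return lo  # the largest b for which no prime factor lies below b


def factor(a, oracle=has_prime_factor_below):
    """Full prime factorization using only yes/no answers from the oracle."""
    factors = []
    while a > 1:
        p = smallest_prime_factor(a, oracle)
        factors.append(p)
        a //= p
    return factors


print(factor(360))
```

Each prime factor costs about $\log_2 a$ oracle calls, and an n-bit number has at most n prime factors, so the overhead of the reduction is polynomial in n.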

Some complexity classes describe models of computation which are essentially realistic. 
P describes problems solvable in polynomial time on Turing machines. Until recently, every 
plausible deterministic model of universal computation has led to the same set of problems 
solvable in polynomial time. That is, all models of universal computation could be simulated 
with polynomial overhead by a Turing machine, and vice versa. Thus the complexity class P 
was regarded as a robust description of what could be efficiently computed deterministically 
in the real world, which captures something fundamental and is not just an artifact of the 
particular model of computation being studied. 

For example, in the standard formulation of a Turing machine, each location on the 
tape can take two states. That is, it contains a bit. If you instead allow d states (a "dit") 
then the speedup is only by a constant factor, which already disappears from our notice 
when we use big-O notation. Furthermore, even parallel computation, although useful in
practice, does not generate a complexity class distinct from P, provided one allows at most 
polynomially many processors as a function of problem size. 

One may ask why polynomial time is chosen as the definition of efficiency. Certainly it 
would be a stretch to consider an algorithm operating in time $n^{c}$ for a very large constant $c$ efficient, or an algorithm
operating in time $2^{\epsilon n}$ for a very small constant $\epsilon$ inefficient. There are several reasons for using polynomial time
as a mathematical formalization of efficiency. First of all, it is mathematically convenient. 
It is a robust definition which allows one to ignore many details of the implementation. 
Furthermore, asymptotic complexity is more robust than the complexity of small instances, 
which can be influenced by the presence of lookup tables or other preprocessed information 
hidden in the program. Secondly, it appears to do a good job of sorting the efficient 
algorithms from the inefficient algorithms in practice. It is rare to obtain a polynomial time
classical algorithm with runtime substantially greater than a low-degree polynomial, or a superpolynomial time
algorithm with runtime substantially less than $2^n$. Furthermore, whenever polynomial time
algorithms are found with large exponents, it usually turns out that either the runtime in 
practice is much better than the worst case theoretical runtime, or a more efficient algorithm 
is subsequently found. 

Sometimes a problem not known to be in P seems to be efficiently solvable in practice.
This can happen either because the problem is not in P but the worst case instances are
hard to construct, or because the problem actually is in P but the proof of this fact is difficult.
Linear programming provides an interesting and historically important example of the
latter. For this problem the best known algorithm was for a long time the simplex method,
which had exponential worst-case complexity, but was generally quite efficient in practice.
An algorithm with polynomial time worst-case complexity has since been discovered.

^1 This, like many reductions to decision problems, can be done using the process called binary search.

One can also consider probabilistic computation. That is, one can give the computer 
the ability to generate random bits, and demand only that it give the correct answer to 
a problem with high probability. It is clear that the set of problems solvable in this way 
contains P and possibly goes beyond it. The standard formalization of this notion is the 
complexity class BPP, which is defined as the set of decision problems solvable in polynomial time on a
probabilistic Turing machine with probability at least 2/3 of giving the correct answer. (BPP
stands for Bounded-error Probabilistic Polynomial-time.) Note that, like P, BPP is defined 
using worst-case instances. The probabilities appear not by randomizing over problem in- 
stances, but by randomizing over the random bits used in the probabilistic algorithm. The 
probability 2/3 appearing in the definition of BPP may appear arbitrary, and in addition, 
not very high. However, choosing any other fixed probability strictly between 1/2 and 1 
yields the same complexity class. This is because one can amplify the success probability 
arbitrarily by running the algorithm multiple times and taking the majority vote. 
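The effect of majority voting can be computed exactly from the binomial distribution. The sketch below is a minimal illustration, assuming the runs are independent and that the number of runs k is odd so that ties cannot occur; the function name is chosen here for convenience.

```python
from math import comb


def majority_success_probability(p, k):
    """Probability that more than half of k independent runs are correct,
    when each run is correct with probability p (k is assumed odd)."""
    return sum(
        comb(k, i) * p**i * (1 - p) ** (k - i)
        for i in range(k // 2 + 1, k + 1)
    )


for runs in (1, 3, 11, 101):
    print(runs, majority_success_probability(2 / 3, runs))
```

For any fixed single-run success probability strictly above 1/2 the failure probability of the majority vote shrinks exponentially in k, which is why the particular constant 2/3 in the definition is immaterial.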

Prior to the discovery of quantum computation, no plausible model of computation was 
known which led to a larger complexity class than BPP. Just as all plausible models of 
classical deterministic computation turned out to be equivalent up to polynomial overhead, 
the same was true for classical probabilistic computation. Furthermore, it is now generally 
suspected that BPP=P. In practice, randomized algorithms usually work just fine if the 
random bits are replaced by pseudorandom bits, which although generated by deterministic 
algorithms, pass most naive tests of randomness (e.g. the various means and correlations 
come out as one would expect for random bits, obvious periodicities are absent, and so 
forth). The conjecture that P=BPP is currently unproven, and finding a proof is a major 
open problem in computer science. (Until recently, the problem of primality testing was 
known to be in BPP but not P, increasing the plausibility that the classes are distinct. 
However, a deterministic polynomial-time algorithm for this problem was recently discov- 
ered.) The notion that BPP captures the power of polynomial time computation in the real 
world was eventually formalized as the strong Church-Turing thesis, which states:

Any "reasonable" model of computation can be efficiently simulated on a probabilistic Turing
machine.

If BPP=P then dropping the word "probabilistic" results in an equivalent claim. The 
strong Church-Turing thesis is named in reference to the original Church-Turing thesis,
which states:

Any "reasonable" model of computation can be simulated on a Turing machine. 

The original Church-Turing thesis is a statement only about what is computable and what 
is not computable, where no limit is made on the amount of time or memory which can be 
used. Thus we have reached a very high level of abstraction at which runtime and mem- 
ory requirements are ignored completely. It is perhaps not intuitively obvious that with 
unlimited resources there is anything one cannot compute. The fact that uncomputable 
functions exist was a profound realization with a simple proof. One can see by Cantor
diagonalization that the set of all decision problems, i.e. the set of all maps from bitstrings 
to {0, 1}, is uncountably infinite. In contrast, the set of all computer programs, which can 
be represented as bitstrings, is only countably infinite. Thus, computable functions make 
up an infinitely sparse subset of all functions.

This leaves the question of whether one can find a natural function which is uncomputable.
In 1936, Alan Turing showed that the problem of deciding whether a given program
terminates is undecidable. This can be proven by the following simple reductio ad
absurdum. Suppose you had a program A, which takes two inputs, a program, and the data
on which the program is to act. A then answers whether the given program halts when run
on the given data. One could use A as a subroutine to construct another program B that
takes a single input, a program. B determines whether the program halts when given itself 
as the data. If the answer is yes, then B jumps into an infinite loop, and if the answer is 
no then B halts. By operating B on itself one thus arrives at a contradiction. 

As we shall see in section 1.3, quantum computers provide the first significant challenge
to the strong Church-Turing thesis. In general, it is difficult to prove that one model of
computation is stronger than another. By discovering an algorithm, one can show that a 
given model of computation can solve a certain problem with a certain number of computa- 
tional steps. However, in most cases it is not known how to show that no efficient algorithm 
on a given model of computation exists for a given problem. In 1994, Peter Shor discovered 
a quantum algorithm which can factor any n-bit number in $O(n^3)$ time [160]. There is no
proof that this problem cannot be solved in polynomial time by a classical computer. How- 
ever, no polynomial time classical algorithm for factoring has ever been discovered, despite 
being studied since at least 200 BC (cf. sieve of Eratosthenes). Furthermore, factoring has
been well-studied in modern times because a polynomial time algorithm for factoring would 
allow the decryption of the RSA public key cryptosystem, which is used ubiquitously for 
electronic transactions. The fact that quantum computers can factor efficiently and classi- 
cal computers can't is one of the strongest pieces of evidence that quantum computers are 
more powerful than classical computers. 

A second piece of evidence for the power of quantum computers is that quantum algo- 
rithms are known which can efficiently simulate the time evolution of many-body quantum 
systems, a task which classical computers apparently cannot perform despite decades of 
effort along these lines. The search for polynomial time classical algorithms to simulate 
quantum systems has been intense because they would have large economic and scientific 
impact. For example, they would greatly aid in the design of new materials and medicines, 
and could aid in the understanding of mysterious condensed-matter systems, such as high- 
temperature superconductors. 

Quantum computers do not provide a challenge to the original Church-Turing thesis. As
we shall see in section 1.2, the behavior of a quantum computer can be completely predicted
by multiplying together a series of exponentially large unitary matrices. Thus any problem
solvable on a quantum computer in polynomial time is solvable on a classical computer in 
exponential time. Hence quantum computers cannot solve problems such as the halting 
problem. 
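The exponential cost of this classical simulation is easy to see in a small state-vector sketch. The code below is an illustration only, written in plain Python with hypothetical function names: it stores all $2^n$ complex amplitudes of an n-qubit state and applies a single-qubit unitary to one qubit, which is equivalent to multiplying by a $2^n \times 2^n$ matrix without ever writing that matrix down.

```python
import math


def apply_single_qubit_gate(state, gate, target, n_qubits):
    """Apply a 2x2 unitary `gate` to qubit `target` of an n-qubit state.

    `state` is a list of 2**n_qubits complex amplitudes, indexed by the
    bitstring of the computational basis state.
    """
    assert len(state) == 1 << n_qubits
    bit = 1 << target
    new_state = [0j] * len(state)
    for index, amplitude in enumerate(state):
        if amplitude == 0:
            continue
        b = 1 if index & bit else 0
        base = index & ~bit
        # Column b of the gate says where this amplitude is sent.
        new_state[base] += gate[0][b] * amplitude
        new_state[base | bit] += gate[1][b] * amplitude
    return new_state


def uniform_superposition(n_qubits):
    """Apply a Hadamard gate to every qubit of the all-zeros state."""
    s = 1 / math.sqrt(2)
    hadamard = [[s, s], [s, -s]]
    state = [0j] * (1 << n_qubits)
    state[0] = 1 + 0j
    for qubit in range(n_qubits):
        state = apply_single_qubit_gate(state, hadamard, qubit, n_qubits)
    return state


print(len(uniform_superposition(10)))
```

Each gate costs time proportional to $2^n$, so a polynomial-length quantum computation is simulated in exponential time, consistent with the claim that quantum computers do not compute anything uncomputable.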

Other complexity classes relating to realistic classical models of computation have been 
defined. (See [141] for an overview.) These are weaker than BPP. The most important of these
for the purposes of this thesis are L and NC1. L stands for Logarithmic space, and NC
stands for Nick's Class. L is the set of problems solvable using only logarithmic memory
(other than the memory used to store the input). The class NC1 is the set of problems
solvable using classical circuits of logarithmic depth. Similarly, NC2 is the set of problems
solvable in depth $O(\log^2(n))$, and so on. Roughly speaking, NC1 can be identified as those
problems in P which are highly parallelizable. For a detailed explanation of why this is a
reasonable interpretation of this complexity class see [141]. For an illustration of the meaning
of circuit depth see figure 1-1.

Figure 1-1: The most common way to measure the complexity of a circuit is the number
of gates, in this case five. However, one can also measure the depth. In this example, the
circuit is three layers deep. The number of gates corresponds to the number of steps in
the corresponding sequential algorithm. The depth corresponds to the number of steps in
the corresponding parallel algorithm, since gates within the same layer can be performed
in parallel.
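The two measures, gate count and depth, can be computed mechanically from a description of a circuit. The sketch below is illustrative: the dictionary encoding and the example circuit are assumptions made here, and the example is not the circuit drawn in the figure, only another circuit with four inputs, five gates, and three layers.

```python
def circuit_size_and_depth(gates, inputs):
    """Return (number of gates, depth) of a circuit.

    `gates` maps a gate name to (operation, tuple of operand names), where
    each operand is either an input wire or another gate.
    """
    depth = {name: 0 for name in inputs}  # input wires sit at depth 0

    def gate_depth(name):
        if name not in depth:
            _, operands = gates[name]
            depth[name] = 1 + max(gate_depth(op) for op in operands)
        return depth[name]

    return len(gates), max(gate_depth(name) for name in gates)


# Hypothetical example circuit: NOT(x0 AND x1) AND NOT(x2 OR x3).
example = {
    "g1": ("AND", ("x0", "x1")),
    "g2": ("OR", ("x2", "x3")),
    "g3": ("NOT", ("g1",)),
    "g4": ("NOT", ("g2",)),
    "g5": ("AND", ("g3", "g4")),
}
size, depth = circuit_size_and_depth(example, ["x0", "x1", "x2", "x3"])
print(size, depth)
```

Here g1 and g2 can be evaluated simultaneously, as can g3 and g4, so a parallel machine needs three steps while a sequential one needs five.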

I have described NC1 using logic circuits. The classical complexity classes such as P
and BPP can also be defined using logic circuits such as the one shown in figure 1-1. A
given circuit takes a fixed number of bits of input (four in the circuit of figure 1-1). Thus,
an algorithm for a given problem corresponds to an infinite family of circuits, one for each 
input size. It is tempting to suggest that P consists of exactly those problems which can 
be solved by a family of circuits in which the number of gates scales polynomially with the 
input size. However, this is not quite correct. The problem is that we have not specified how 
the circuits will be generated. It is unreasonable to specify an algorithm by an infinitely 
long description containing the circuit for each possible input size. Such arbitrary families 
of circuits are called "nonuniform".

The set of problems solvable by polynomial size nonuniform circuits may be much larger 
than P, because one can precompute the answers to the problem and hide them in the 
circuits. One can even "solve" uncomputable problems this way. A uniform family of 
circuits is one such that given an input size n, one can efficiently generate a description of 
the corresponding circuit. One may for example demand that a fixed Turing machine can 
produce a description of the circuit corresponding to n, given n as an input. In practice, 
a family of circuits is usually described informally, such that it is easily apparent that it is 
uniform. The set of decision problems efficiently solvable by a uniform family of polynomial 
size circuits is exactly P. 

While discussing circuits, it bears mentioning that the set of gates used in figure 1-1,
namely AND, OR, and NOT, is universal. That is, any function from n bits to one bit
(here we are again restricting to decision problems out of convenience and not necessity)
can be computed using some circuit constructed from these elements. The proof of this is
fairly easy, and the interested reader may work it out independently. A solution is given in
appendix A. Note that the number of possible functions from n bits to one bit is $2^{2^n}$. In
contrast the number of possible circuits with n gates is singly exponential in n. Thus, most of
the functions on n bits must have exponentially large circuits. Both this universality result
and this counting argument have quantum analogues, which are discussed in subsequent 
sections. 
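One standard way to see the universality of AND, OR, and NOT is to build, for each input on which the function outputs 1, a conjunction that recognizes exactly that input, and then OR these conjunctions together. The Python sketch below illustrates this construction; it is offered as one possible proof sketch under that assumption, not necessarily the argument given in the appendix, and the names are invented here.

```python
from itertools import product


def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def NOT(a):
    return 1 - a


def dnf_circuit(f, n):
    """Return a function that computes f on n bits using only AND, OR, NOT."""
    accepting = [bits for bits in product((0, 1), repeat=n) if f(*bits)]

    def circuit(*x):
        # The constants 0 and 1 used as starting values could themselves be
        # built from gates, e.g. 0 = AND(x0, NOT(x0)).
        result = 0
        for pattern in accepting:
            term = 1
            for xi, pi in zip(x, pattern):
                literal = xi if pi else NOT(xi)
                term = AND(term, literal)
            result = OR(result, term)
        return result

    return circuit


parity = dnf_circuit(lambda a, b, c: a ^ b ^ c, 3)
print([parity(*x) for x in product((0, 1), repeat=3)])
```

The construction may use up to $2^n$ conjunctions, so it gives circuits of exponential size in general, which is consistent with the counting argument: there are $2^{2^n}$ functions and far fewer small circuits.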

For this thesis, P, BPP, L, and NC1 are a sufficient set of realistic classical complexity
classes to be familiar with. We'll now move on to describe a few of the more fanciful classes. 
These describe models of computation which are not realistic and classes of problems not 
necessarily expected to be efficiently solvable in the real world. The most important of these 
is NP. NP stands for Nondeterministic Polynomial-time. Loosely speaking, it is the set of 
problems whose solutions are verifiable in polynomial time. NP contains P, because if you
have a polynomial time algorithm for correctly solving a problem, you can always verify a 
proposed solution in polynomial time by simply computing the solution yourself. 

More precisely, NP is defined in terms of witnesses (also sometimes called proofs or 
certificates). These are simply bitstrings which certify the correctness of an answer to a 
given problem. NP is the set of decision problems such that there exists a polynomial time 
algorithm (called the verifier), such that if the answer to the instance is yes, there exists a
bitstring of polynomial length (the witness), which the verifier accepts. If the answer to
the instance is no, then the verifier will reject all inputs. This definition can be illustrated
using Boolean satisfiability, which is a canonical example of a problem in NP. The problem
of Boolean satisfiability is, given a Boolean formula on n variables, determine whether there 
is some assignment of true/false to these variables which makes the Boolean formula true. 
The witness in this case is a string of n bits listing the true/false values of each of the 
variables. The verifier simply has to substitute these values in and evaluate the Boolean 
formula, a task easily doable in polynomial time. 
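The asymmetry between verifying and searching can be made concrete with a small sketch. For simplicity the code below assumes the formula is given in conjunctive normal form, encoded as lists of signed integers (k for variable k, -k for its negation); this encoding and the function names are assumptions of the example, whereas the text allows arbitrary Boolean formulas.

```python
from itertools import product


def verify(clauses, witness):
    """Polynomial time verifier: does the assignment satisfy every clause?

    `witness[k - 1]` is the truth value assigned to variable k.
    """
    return all(
        any(witness[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def brute_force_witness(clauses, n):
    """Exhaustive search over all 2**n assignments, for contrast."""
    for bits in product((False, True), repeat=n):
        if verify(clauses, bits):
            return list(bits)
    return None


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
formula = [[1, -2], [2, 3], [-1, -3]]
print(verify(formula, [True, True, False]))
print(brute_force_witness(formula, 3))
```

The verifier runs in time linear in the size of the formula, while the only search procedure shown here tries exponentially many witnesses in the worst case.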

NP apparently does not correspond to the set of problems efficiently solvable using 
any realistic model of computation. Why then would anyone study NP? One reason is 
that, although there is clearly more practical interest in understanding which problems 
are efficiently solvable, there is certainly some appeal at least philosophically, in knowing 
which problems have efficiently verifiable solutions. Perhaps the most important motivation, 
however, is that by introducing a strange model of computation such as nondeterministic 
Turing machines, we gain a tool for classifying the difficulty of computational problems. 

When faced with a difficult computational problem, it is very difficult to know whether 
one's inability to find an efficient algorithm is fundamental or merely a failure of imagination. 
How can one know whether it is time to give up, or whether the solution is around the
next corner? Complexity classes give us two handles on the difficulty of a computational 
problem: containment and hardness. Containment is the more straightforward of the two. If 
a problem is contained in a given complexity class, then it can be solved by the corresponding 
model of computation. In a sense this gives an upper bound on the problem's difficulty. 
The less obvious concept is hardness. In computer science, "hardness" is a technical term 
with a precise meaning different from its common usage. If a problem is hard for a given 
complexity class, this means that any problem in that class is reducible to an instance of 
that problem. For example, if a problem is NP-hard, it means that any problem contained 
in NP can be reduced to an instance of that problem in polynomial time and with at most 
polynomial increase in problem size. Thus, up to polynomial factors, an NP-hard problem 
is at least as hard as any problem in NP. If one could solve that problem in polynomial 
time, then one could solve all NP problems in polynomial time. 

It is not obvious that NP-hard problems exist. After all, how could one ever show that
every single problem in NP reduces to a given problem? We don't even know what all the 
problems in NP are! We'll use Boolean satisfiability as an example to see how it is in fact 
possible to prove that a problem is NP-hard. As discussed earlier, logic circuits made from 
AND, OR, and NOT gates form a universal model of computation, equal in power (up to 
polynomial factors) to the Turing machine model. (See appendix A.) Thus the verifier for
a problem in NP can be constructed as a logic circuit from such gates. Such a logic circuit 
corresponds directly to a Boolean formula made from AND, OR, and NOT. This formula 
will be satisfiable if and only if there exists some input (the witness) which causes the verifier 
to accept. Thus we have proven that Boolean satisfiability is NP-hard. Given this fact, one 
can then prove the NP-hardness of other problems by reductions of Boolean satisfiability to 
other problems. Boolean satisfiability has the property that it is both contained in NP and
it is NP-hard. Such problems are called NP-complete. In a well-defined sense, NP-complete 
problems are the hardest problems in NP. Furthermore, if one specifies a problem and says 
it is complete for class X, then that statement uniquely defines complexity class X. 

Boolean satisfiability is not the only NP-complete problem. In fact, there are now hundreds
of NP-complete problems known. (See [76] for a partial catalog of these.) Remarkably,
experience has shown that if a well-defined computational problem resists all attempts to
find a polynomial time classical solution, it almost always turns out to be NP-hard. There
are only a few problems currently known which are believed to be neither in P nor NP- 
hard. These include factoring, discrete logarithm, graph isomorphism, and approximating 
the shortest vector in a lattice. If a problem is NP-hard, this is taken as evidence that the 
problem is not solvable in polynomial time. If it were, then all of NP would be solvable in 
polynomial time. This is considered unlikely, because it seems contrary to experience that
verifying the solution to a problem is fundamentally no harder than finding the solution. 
Furthermore, it seems unlikely that all those hundreds of NP-complete problems really do 
have polynomial-time solutions which were never discovered despite tremendous effort by 
very smart people over long periods of time. On the other hand, there is no proof that all 
of NP is not solvable in polynomial time. This is the famous P vs. NP problem which, for 
various reasons,^2 is thought to be very difficult.

NP is not the only complexity class based on a non-realistic model of computation. 
Another important class is coNP. This is the set of problems which have witnesses for the 
no instances. In other words, these problems are the complements of the problems in NP. 
NP and coNP overlap but are believed to be distinct. The problem of factoring integers is 
known to be contained in both NP and coNP. This is one reason factoring is not believed 
to be NP-complete. If it were then NP would be contained in coNP. Graph isomorphism is 
also suspected to be contained in the intersection of NP and coNP. MA is the probabilistic 
version of NP, where the verifier is a BPP machine rather than a P machine. PSPACE is 
the set of problems solvable using polynomial memory. Polynomial space is a very powerful 
model of computation. The class PSPACE contains both NP and coNP and is believed 
to be strictly larger than either. #P is like NP except to answer a #P problem one must
count the number of witnesses rather than just answering whether any witnesses exist. #P
is therefore not a decision class. To make comparisons between #P and decision classes one
often uses $P^{\#P}$, which is the set of problems solvable by a polynomial time machine with
access to an "oracle" which at any timestep can be queried to solve a #P problem. Many
more complexity classes have been defined (see [1]). However the ones described above will
suffice for this thesis. 
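The difference between an NP question and the corresponding #P question can be illustrated with satisfiability. The sketch below assumes a formula in conjunctive normal form encoded as lists of signed integers (k for variable k, -k for its negation); the encoding and names are assumptions of this example, and the exhaustive counting is of course exponential time.

```python
from itertools import product


def satisfies(clauses, assignment):
    """True if the assignment (a tuple of booleans) satisfies every clause."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def count_witnesses(clauses, n):
    """#P-style answer: how many of the 2**n assignments are witnesses?"""
    return sum(
        1 for bits in product((False, True), repeat=n) if satisfies(clauses, bits)
    )


def has_witness(clauses, n):
    """NP-style answer: is there at least one witness?"""
    return count_witnesses(clauses, n) > 0


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
formula = [[1, -2], [2, 3], [-1, -3]]
print(has_witness(formula, 3), count_witnesses(formula, 3))
```

Knowing the count immediately answers the decision question, which is one way to see why a machine with a #P oracle is at least as powerful as NP.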

As is apparent from the preceding discussion, many complexity-theoretic results are
founded on widely accepted conjectures, such as the conjecture that P is not equal to NP. 
This is perhaps an unfamiliar situation. These conjectures are neither proven mathematical 
facts, nor are they the familiar sort of empirical facts based on physical experiments. They 
are instead empirical facts based on mathematical evidence. How can one assign probability 
of correctness to mathematical conjectures? Does it even make sense to do so?^3 These are

^2 In addition to the failed attempts by many smart people to find a proof that P ≠ NP, there are additional
reasons to believe that finding a proof should be hard. Namely, theorems have now been proven which show 
that the most natural methods for proving whether P is equal to NP are irrefutably doomed from the 
start [146]. 

^3 To give a more specific example, suppose you conjectured that P ≠ NP. Then you proposed various
polynomial time algorithms for NP-hard problems. Whether each of these algorithms work depends on 
Figure 1-2: This diagram summarizes known and conjectured relationships between the
classical complexity classes discussed in this section. All of the containments shown have 
been proven. However, none of the containments have been proven strict other than P is a 
strict subset of EXP and L is a strict subset of PSPACE. 



interesting philosophical questions, but to my knowledge unresolved ones. In any case, 
they are beyond the scope of this thesis. In practice the conjecture that P is not equal to 
NP is almost universally believed by the relevant experts. Many other similar complexity- 
theoretic conjectures are often also considered to be well-founded, although not necessarily as
much so as P ≠ NP.

In the presence of all this conjecturing, it is worth mentioning that some relationships 
between complexity classes are known with certainty. One thing that is known in general 
is that the class defined by a space bound of n is contained in the class defined by a time 
bound of $2^n$. This is because any algorithm running for time longer than $2^n$ with only n
bits of memory necessarily revisits a state it has already been in, and is therefore in an 
infinite loop. Thus any problem solvable in logarithmic space is solvable in polynomial 
time, and any problem solvable in polynomial space is solvable in exponential time. In 
general, containments are easier to prove than separations. For example, it is trivial to show
that P is contained in NP, but nobody has ever succeeded in showing that NP is larger 
than P. An exception to this is that separations are not hard to prove between classes 
of the same type. For example, it is proven that exponential time (EXP) is a strictly 
larger class than polynomial time (P), and polynomial space (PSPACE) is a strictly larger 
class than logarithmic space (L). In fact, it is even possible to prove that there exist some 
problems solvable in time $O(n^{b})$ not solvable in time $O(n^{a})$, for constants $1 \le a < b$. This is done using a method
called diagonalization [141]. However, the argument is essentially non-constructive, and it
is generally not known how to prove unconditional lower bounds on the amount of time 
needed to solve a given problem. 
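The pigeonhole argument relating space to time can be turned into a short procedure. The following sketch is a simplified illustration: it treats the entire configuration of a deterministic machine as a single n-bit integer (glossing over details such as head position and control state), and the interface, in which `step` returns the next configuration or `None` on halting, is an assumption made for this example.

```python
def halts(step, start, n_bits):
    """Decide whether a deterministic machine with n_bits of memory halts.

    `step(state)` returns the next state, or None when the machine halts.
    There are only 2**n_bits distinct states, so a run that has not halted
    after 2**n_bits steps must have repeated a state and will loop forever.
    """
    state = start
    for _ in range(2 ** n_bits):
        state = step(state)
        if state is None:
            return True
    return False


def counter(state):
    """Counts up to 7 and then halts."""
    return None if state == 7 else state + 1


def cycler(state):
    """Cycles through all eight 3-bit states forever."""
    return (state + 1) % 8


print(halts(counter, 0, 3), halts(cycler, 0, 3))
```

The loop runs at most $2^n$ times, which is the time bound quoted in the text for a space bound of n.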

All of the complexity theory described so far has been about problems where the input 

various calculations the result of which are not obvious a priori. Upon performing the calculations, one 
finds in every case that they come out in just such a way that the polynomial-time algorithms for the NP- 
hard problems fail. Can one somehow use Bayesian reasoning in this case, regarding the calculations as 
experiments and their outcomes as evidence in favor of the conjecture P ≠ NP?
is given as a string of bits. However, one can also imagine providing the input in the form
of an oracle. An oracle is a subroutine whose code is hidden. One then computes some 
property of the oracle by making queries to it. For example, the oracle might implement 
some function $f : \{1,2,\ldots,m\} \to \{1,2,\ldots,n\}$, and we want to compute $\sum_{x=1}^{m} f(x)$. We
can do this by querying the oracle m times, once for each value of x, and summing up
the results. It is also clear that this cannot be done by querying the oracle fewer than m 
times. This demonstrates a very nice feature of the oracular setting, which is that it is often 
possible to prove lower bounds on the number of queries necessary for computing a given 
property. 
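Query complexity is easy to measure in code by wrapping the hidden function in an object that counts calls. The sketch below is illustrative; the class and function names and the particular hidden function are invented for the example.

```python
class CountingOracle:
    """Wraps a function so that only its input/output behavior is visible,
    and records how many times it has been queried."""

    def __init__(self, f):
        self._f = f
        self.queries = 0

    def __call__(self, x):
        self.queries += 1
        return self._f(x)


def oracle_sum(oracle, m):
    """Compute f(1) + f(2) + ... + f(m) using exactly m queries."""
    return sum(oracle(x) for x in range(1, m + 1))


oracle = CountingOracle(lambda x: (x * x) % 7 + 1)  # arbitrary hidden function
total = oracle_sum(oracle, 10)
print(total, oracle.queries)
```

The lower bound in the text corresponds to the observation that if any one of the m inputs is never queried, the value of f there could be changed without the algorithm noticing, which would change the sum.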

The oracular model of computation is artificial, in the sense that we have artificially 
prohibited access to the code implementing the oracle. However, in many settings it seems 
unlikely that examining the source code would help. Even simple functions that can be 
written down using a small number of algebraic symbols often lack analytical antiderivatives, 
and to find the definite integral there seems to be nothing better to do than evaluate the 
function at a series of points and use the trapezoid rule or other similar techniques. This is 
exactly the oracular case. Similarly, Newton's method for finding roots, and gradient descent 
methods for finding minima are both oracular algorithms. If the function is implemented 
by some large and complicated numerical calculation then it seems even more likely that 
for finding integrals, derivatives, extrema, and so on, there is nothing better to be done 
than simply querying the function at various points and performing computations with 
the resulting data. For these reasons, and because query complexity is much more easily 
analyzed than computational complexity, the oracular setting is an important area of study 
in both classical and quantum computation. 
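
As a hedged illustration of why numerical integration is naturally oracular, the sketch below applies the trapezoid rule to $e^{-x^2}$, which has no elementary antiderivative. The function names and the choice of integrand are assumptions made for this example; the integrand is touched only through point evaluations, and the number of evaluations is the query count.

```python
import math

calls = []  # records every point at which the integrand is queried


def gaussian(x):
    """Example integrand, treated as a black box by the integrator."""
    calls.append(x)
    return math.exp(-x * x)


def trapezoid(f, a, b, num_intervals):
    """Approximate the integral of f over [a, b]; f is queried num_intervals + 1 times."""
    h = (b - a) / num_intervals
    total = 0.5 * (f(a) + f(b))
    for k in range(1, num_intervals):
        total += f(a + k * h)
    return h * total


estimate = trapezoid(gaussian, 0.0, 1.0, 1000)
print(estimate, len(calls))
```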

1.2 Quantum Computation Preliminaries 

Because this is a physics thesis, I'll assume familiarity with quantum mechanics. Many 
standard books exist on the subject [50, 128, 154, 185]. However, the emphasis in these books 
is not necessarily placed on the aspects of quantum mechanics which are most necessary for 
quantum computing. A nice brief quantum-computing oriented introduction to quantum 
mechanics is given in the second chapter of [137]. 

To reason about quantum computers, one needs a mathematical model of them. In fact, 
as I will argue in this thesis, it is helpful to have several mathematical models of quantum 
computers. The most widely used model of quantum computation is the quantum circuit 
model, and I will now describe it. 

The first concept needed to define a quantum circuit is the qubit. Physically, a qubit is 
a two-state quantum mechanical system, such as a spin-1/2 particle. As such, its state is 
given by a normalized vector in $\mathbb{C}^2$. One normally imagines doing quantum computation by 
performing unitary operations on an array of qubits. One could of course use d-state systems 
with d > 2. Using d-dimensional units (called qudits) generally results in only a speedup 
by a constant factor, which will not even be noticed if one is using big-O notation. Since 
it makes no difference algorithmically, people almost always choose the lowest dimensional 
nontrivial systems for simplicity, and these are qubits. This is analogous to the classical 
case. In addition to their physical interpretation, qubits have meaning as the basic unit 
of quantum information. This meaning arises from the study of quantum communication, 
sometimes known as quantum Shannon theory. Quantum Shannon theory will not be 
discussed in this thesis. For this see [92, 137]. 

Next, we need some way of acting upon qubits. Upon thinking about the Coulombic 
forces between charged particles, the gravitational forces between massive objects, the 
interaction between magnetic dipoles, and so forth, one sees that most interactions appearing 
in nature are pairwise. That is, the total energy of a configuration of n particles is of the 
form 

$$E = \sum_{i,j=1}^{n} E_{ij},$$

where $E_{ij}$ depends only on the states of particles i and j. This carries over into quantum 
mechanical systems. Thus, one expects only to directly enact operations on single qubits 
or pairs of qubits. As a simple model of quantum computation, one may suppose that 
one can apply arbitrary unitary operations on individual qubits and pairs of qubits. A 
quantum computation then consists of a polynomially long sequence of such operations. 
From an algorithmic point of view this is considered to be a perfectly acceptable definition 
of a quantum computer. The individual one-qubit and two-qubit unitaries are called gates, 
by analogy to the classical logic gates. The entire sequence of unitaries is called a quantum 
circuit. 

In the quantum circuit model, the input to the computation (the problem instance) is 
the initial state of the qubits prior to being acted upon by the series of unitaries. Since 
human minds are apparently classical, the problems we wish to solve are classical. Thus, 
we will only consider problems whose inputs and outputs are classical bitstrings. We can 
choose two orthogonal states of a given qubit as corresponding to classical 0 and 1. These 
states form a basis for the Hilbert space of the qubit, known as the computational basis. 
The computational basis states of a qubit are conventionally labelled $|0\rangle$ and $|1\rangle$. The $2^n$ 
states obtained by putting each qubit into $|0\rangle$ or $|1\rangle$ form the computational basis for 
the $2^n$-dimensional Hilbert space of the entire system. Rather than labelling these states 
by 

$$\begin{array}{c}
|0\rangle \otimes |0\rangle \otimes \cdots \otimes |0\rangle \\
|0\rangle \otimes |0\rangle \otimes \cdots \otimes |1\rangle \\
\vdots \\
|1\rangle \otimes |1\rangle \otimes \cdots \otimes |1\rangle
\end{array}$$

it is conventional to simply write them as 

$$\begin{array}{c}
|00 \ldots 0\rangle \\
|00 \ldots 1\rangle \\
\vdots \\
|11 \ldots 1\rangle
\end{array}$$

The input to the computation is the computational basis state corresponding to the classical 
bitstring which specifies the problem instance. The output of the computation is the result 
of a measurement in the computational basis. 
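
To illustrate the correspondence between tensor products of single-qubit states and bitstring labels, here is a small sketch. The helper names `kron` and `basis_state` are assumptions for this example. It builds the state vector of a computational basis state qubit by qubit and shows that it has a single nonzero amplitude, located at the index given by reading the bitstring as a binary number.

```python
def kron(u, v):
    """Tensor (Kronecker) product of two state vectors given as lists of amplitudes."""
    return [a * b for a in u for b in v]


KET = {"0": [1, 0], "1": [0, 1]}  # single-qubit computational basis states


def basis_state(bits):
    """State vector of |bits> built as a tensor product of single-qubit states."""
    state = [1]
    for bit in bits:
        state = kron(state, KET[bit])
    return state


state = basis_state("101")
print(state.index(1), len(state))
```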

This is not a matter of mere notation. Arbitrary quantum states are hard to produce, 
and measurements in arbitrary bases are hard to perform. The special feature of the 
computational basis is that it is a basis of tensor product states, that is, in every 
computational basis state, the qubits are completely unentangled. Such states are easy to 
generate, since the qubits need only be put into their states individually without making 
them interact. Similarly, the measurement at the end can be performed by measuring the 
qubits one by one. 

Figure 1-3: These gates are universal for classical computation. Below each gate is the 
corresponding "truth table" giving the dependence of output on input. 



Classically, any Boolean function can be constructed using only AND, NOT, and FANOUT, 
as shown in figure 1-3. Thus, this set of gates is said to be universal. Similarly, the set 
of two-qubit quantum gates is universal in the sense that any unitary on n qubits can be 
constructed as a product of such gates. In general, this can require exponentially many 
gates, much like the classical case. The proof that two-qubit gates are universal is given in 
detail in [137], so I will only sketch it here. The approach is to first show that the set of 
two-level unitaries is universal. A two-level unitary is one which interacts two basis states 
unitarily and leaves all other states untouched, as shown below. 

$$\begin{pmatrix}
1 & & & & & \\
& u_{11} & & u_{12} & & \\
& & 1 & & & \\
& u_{21} & & u_{22} & & \\
& & & & 1 & \\
& & & & & 1
\end{pmatrix}$$

Given any arbitrary $2^n \times 2^n$ unitary, one can left-multiply by a sequence of two-level unitaries 
to eliminate off-diagonal matrix elements one by one. This process is somewhat analogous 
to Gaussian elimination. At the same time, one can also ensure that the remaining diagonal 
elements are all equal to 1. That is, for any unitary U on n qubits, there is some sequence 
of two-level unitaries $U_m U_{m-1} \cdots U_2 U_1$ such that 

$$U_m U_{m-1} \cdots U_2 U_1 U = \mathbb{1}.$$

Thus, for any U, there is a product of two-level unitaries equal to $U^{-1}$, which shows that 
two-level unitaries are universal. 

For any given pair of basis states $|x\rangle, |y\rangle$, one can construct the two-level unitary that 
acts on them according to 

$$U = \begin{pmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{pmatrix}$$

by conjugating the single-qubit gate for U with a matrix $U_\pi$ that permutes the basis so that 
$U_\pi |x\rangle$ and $U_\pi |y\rangle$ differ on a single bit. It is a simple exercise to show that $U_\pi$ can always 
be constructed by a sequence of controlled-not (CNOT) gates. CNOT is a two-qubit gate 
that acts on two qubits according to: 

$$\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix}
\begin{array}{c} 00 \\ 01 \\ 10 \\ 11 \end{array}
\qquad (1.1)$$

The bitstrings on the right label the four computational basis states of the two qubits. The 
controlled-not gets its name from the fact that a NOT gate is applied to the second bit (the 
target bit) only if the first bit (the control bit) is 1. 
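
The relation between the classical action of CNOT and its permutation matrix can be checked with a short sketch. The function names are assumptions for this example, and the basis ordering 00, 01, 10, 11 follows the labelling described above.

```python
def cnot_bits(control, target):
    """Classical action of CNOT: flip the target bit iff the control bit is 1."""
    return control, target ^ control


def cnot_matrix():
    """4x4 permutation matrix of CNOT in the basis ordering 00, 01, 10, 11."""
    matrix = [[0] * 4 for _ in range(4)]
    for c in (0, 1):
        for t in (0, 1):
            c_out, t_out = cnot_bits(c, t)
            # Column index = input basis state, row index = output basis state.
            matrix[2 * c_out + t_out][2 * c + t] = 1
    return matrix


for row in cnot_matrix():
    print(row)
```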

Although this gate universality result is a very nice first step, it is still not fully satisfying. 
The set of two-qubit gates (4x4 unitary matrices) forms a continuum. An infinite number 
of bits would be necessary to exactly specify a particular gate. The same goes for one-qubit 
gates (2x2 unitary matrices). However, this is a surmountable problem. The reason is 
that small deviations from the desired gate will cause only a small probability of error in the 
final measurement. This is because the deviations from the desired state caused by each 
gate add at most linearly, which we now show, following [137]. 

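Suppose we wish to perform the gate V followed by the gate U. In reality we perform 
imprecise versions of these, V' followed by U'. We'll quantify the error introduced by the 
imprecise gates by 

$$E(U'V') = \max_{\langle\psi|\psi\rangle=1} \left\| U'V'|\psi\rangle - UV|\psi\rangle \right\|,$$

which equals 

$$= \max_{\langle\psi|\psi\rangle=1} \left\| (UV - UV')|\psi\rangle + (UV' - U'V')|\psi\rangle \right\|.$$

By the triangle inequality this is at most 

$$\le \max_{\langle\psi|\psi\rangle=1} \left( \left\| UV|\psi\rangle - UV'|\psi\rangle \right\| + \left\| UV'|\psi\rangle - U'V'|\psi\rangle \right\| \right).$$

By unitarity, this is at most 

$$\le \max_{\langle\psi|\psi\rangle=1} \left\| V|\psi\rangle - V'|\psi\rangle \right\| + \max_{\langle\phi|\phi\rangle=1} \left\| U|\phi\rangle - U'|\phi\rangle \right\| = E(V') + E(U').$$

Thus 

$$E(U'V') \le E(V') + E(U'). \qquad (1.2)$$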



By equation 1.2 one sees that it is not necessary to obtain higher than polynomial 
accuracy in the gates in order to implement quantum circuits of polynomial size. Hence 
only logarithmically many bits are needed to specify a gate. This result can be improved 
upon in two ways. First, it turns out that it is unnecessary to have even a polynomially 
large set of gates. Instead, arbitrary one and two qubit gates can always be constructed 
with polynomial accuracy using a sequence of logarithmically many gates chosen from some 
finite set of universal quantum gates. This result is known as the Solovay-Kitaev theorem, 
which we state formally below. Universal sets of quantum gates are known with as few as 
two gates. Secondly, the fault tolerance threshold theorem shows (among other things) that 
it is in fact unnecessary to achieve higher than constant accuracy in implementing each 
gate. Fault tolerance thresholds are discussed in section [T31 
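
The step from the two-gate bound to "polynomial accuracy suffices" can be spelled out as a short worked estimate. The symbols G (number of gates), $\epsilon$ (per-gate error), and $\delta$ (target total error) are introduced here for illustration and are not notation from the original text.

```latex
% Assumptions: a circuit U_G \cdots U_1 is implemented as U_G' \cdots U_1',
% with E(U_k') \le \epsilon for every gate. Applying the two-gate bound G - 1 times:
E(U_G' \cdots U_1') \;\le\; \sum_{k=1}^{G} E(U_k') \;\le\; G \epsilon .
% To keep the total error below a constant \delta it suffices that
\epsilon \;\le\; \frac{\delta}{G},
% so for G = \mathrm{poly}(n) the required per-gate precision is 1/\mathrm{poly}(n),
% which takes only
\log_2 \frac{1}{\epsilon} \;=\; \log_2 \frac{G}{\delta} \;=\; O(\log n)
% bits per gate parameter.
```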

The following is a formal statement of the Solovay-Kitaev theorem adapted from [116]. 

Theorem 1 (Solovay-Kitaev). Suppose matrices $U_1, \ldots, U_r$ generate a dense subgroup in 
SU(d). Then, given a desired unitary $U \in SU(d)$, and a precision parameter $\delta > 0$, there 
is an algorithm to find a product V of $U_1, \ldots, U_r$ and their inverses such that $\|V - U\| < \delta$. 
The length of the product and the runtime of the algorithm are both polynomial in $\log(1/\delta)$. 

Combining this with the universality of two-qubit unitaries, one sees that any set of 
one-qubit and two-qubit gates that generates a dense subgroup of SU(4) is universal for quantum 
computation. A convenient universal set of quantum gates is the CNOT, Hadamard, and 
$\pi/8$ gates. The CNOT gate we have encountered already in equation 1.1. The Hadamard 
gate is 

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

and the $\pi/8$ gate is 

$$T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}.$$

Although two-qubit gates are universal, it does not follow that arbitrary unitaries can 
be constructed efficiently from two-qubit gates. In fact, even the set of $2^n \times 2^n$ permutation 
matrices (corresponding to reversible computations) is doubly exponentially large, whereas 
the set of polynomial size quantum circuits is only singly exponentially large, given any 
discrete set of gates. Thus, some unitaries on n qubits require exponentially many gates to 
construct, as a function of n. 
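
The counting argument can be checked numerically with a rough sketch. The bound used for the number of circuits (each gate is one of `gate_set_size` types applied to an ordered pair of qubits), the gate-set size of 3, and the circuit size $n^3$ are all assumptions chosen for illustration. The comparison is between the base-2 logarithms of the two counts.

```python
import math


def log2_num_permutations(n):
    """log2 of (2^n)!, the number of 2^n x 2^n permutation matrices."""
    return math.lgamma(2 ** n + 1) / math.log(2)


def log2_num_circuits(n, num_gates, gate_set_size):
    """Upper bound on log2 of the number of distinct circuits with num_gates gates."""
    choices_per_gate = gate_set_size * n * n  # gate type times ordered qubit pair
    return num_gates * math.log2(choices_per_gate)


for n in (4, 8, 16, 20):
    perms = log2_num_permutations(n)
    circuits = log2_num_circuits(n, num_gates=n ** 3, gate_set_size=3)
    print(n, round(perms), round(circuits))
```

For small n the crude circuit bound is the larger of the two, but the permutation count grows like $n 2^n$ bits while the circuit count grows only polynomially in n, so the permutations eventually far outnumber the circuits.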

We have now seen that using a discrete set of quantum gates we can construct arbitrary 
unitaries, although some n-qubit unitaries require exponentially many gates. This is in 
some sense a universality result. However, what we are really interested in is computational 
universality. At present it is not yet obvious that one can even efficiently perform universal 
classical computation with such a set of gates. However, it turns out that this is indeed 
possible. It would be surprising if this were not possible, since classical physics, upon which 
classical computers are based, is a limiting case of quantum physics. Nevertheless showing 
how to specifically implement classical computation with a quantum circuit is not trivial. 
The essential difficulty is that the standard sets of universal classical gates include gates 
which lose information. For example, the AND gate takes two bits of input and produces 
only a single bit of output. There is no way of deducing what the input was just by reading 
the output. In contrast, the quantum mechanical time evolution of a closed system is 
unitary and therefore never loses any information. 

The solution to this conundrum actually predates the field of quantum computation and 
goes by the name of reversible circuits. It turns out that universal classical computation 
can be achieved using gates that have the same number of output bits as input bits, and 
which furthermore never lose any information. That is, the map of inputs to outputs is 
injective. These are called reversible gates. The CNOT gate described in equation 1.1 is an 
example of a classical reversible gate which has truth table 

$$\begin{array}{c}
00 \to 00 \\
01 \to 01 \\
10 \to 11 \\
11 \to 10
\end{array}$$

Figure 1-4: The left circuit uses a controlled-SWAP (i.e. a Fredkin gate) to achieve AND 
using one ancilla bit initialized to zero. The right circuit uses a Fredkin gate to achieve 
NOT and FANOUT using two ancilla bits initialized to zero and one. AND, NOT, and 
FANOUT are universal for classical computation, thus classical computation can be 
performed reversibly using Fredkin gates and ancilla bits. 

By itself, CNOT is not universal. However, the Fredkin gate, or controlled-SWAP, is. This 
gate has the truth table 

$$\begin{array}{c}
000 \to 000 \\
001 \to 001 \\
010 \to 010 \\
011 \to 011 \\
100 \to 100 \\
101 \to 110 \\
110 \to 101 \\
111 \to 111
\end{array}$$

The second pair of bits are swapped only if the first bit is 1. As shown in figure 1-4, 
AND, NOT, and FANOUT can all be implemented using the Fredkin gate. In standard 
non-reversible classical circuits one normally takes FANOUT for granted, considering it 
to be achieved by splitting a wire. In the context of reversible computing one must be 
more careful. The FANOUT operation requires the use of an additional bit initialized to 
the state 0 to take the copied value of the bit undergoing FANOUT. In fact, each of the 
constructions shown in figure 1-4 requires initialized work bits, known as ancilla bits. This 
is a generic feature of reversible computation because "garbage" bits cannot be erased and 
instead are simply carried to the end of the computation. 
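
The constructions described in the caption of figure 1-4 can be sketched in a few lines. The wiring below (which input is the control and which output carries the result) is one arrangement consistent with the truth table above; it is an assumption and may differ from the figure's exact layout.

```python
def fredkin(c, a, b):
    """Controlled-SWAP: swap a and b iff the control bit c is 1."""
    return (c, b, a) if c == 1 else (c, a, b)


def and_gate(x, y):
    """AND using one ancilla initialized to 0; the third output carries x AND y."""
    _, _, out = fredkin(x, y, 0)
    return out


def not_and_fanout(x):
    """NOT and FANOUT using ancillas initialized to 0 and 1.

    Returns (x, copy of x, NOT x).
    """
    return fredkin(x, 0, 1)


for x in (0, 1):
    print(x, not_and_fanout(x), [and_gate(x, y) for y in (0, 1)])
```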

Because AND, NOT, and FANOUT can each be constructed from a single Fredkin gate, 
it follows that taking classical circuits and making them reversible incurs only constant 
overhead. Thus, the set of problems solvable in polynomial time on uniform families of reversible 
circuits is exactly P. On a quantum computer, a reversible 3-bit gate such as the Fredkin 
gate corresponds to a 3-qubit quantum gate which is an 8 x 8 permutation matrix, 
permuting the basis states in accordance with the gate's truth table. Hence reversible computation 
is efficiently achievable on quantum computers. Because of this generic construction, 
current research on quantum algorithms focuses on quantum algorithms which beat the best 
classical algorithms. Quantum algorithms matching the performance of classical algorithms 
can always be achieved using reversible circuits. 

Figure 1-5: Garbage bits can be reset to zero. This is done by first performing the 
computation, then copying the output into an ancilla register, then using the inverse computation 
to "uncompute" the garbage bits. 



For the purpose of quantum computation it is often important to remove the garbage 
qubits accumulated at the end of a reversible computation, because these can destroy the 
interference needed in quantum algorithms. It is always possible to remove the garbage 
bits by first performing the reversible computation, then using CNOT gates to copy the 
result into a register of ancilla bits initialized to zero, and then reversing the 
computation, as illustrated in figure 1-5. The process of reversing the computation is known as 
uncomputation. 
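
The compute, copy, uncompute pattern can be sketched on classical bits. This is a minimal sketch under the assumption that the reversible computation is a single Fredkin gate computing AND; the function name `clean_and` and the register layout are hypothetical.

```python
def fredkin(c, a, b):
    """Controlled-SWAP: swap a and b iff the control bit c is 1."""
    return (c, b, a) if c == 1 else (c, a, b)


def clean_and(x, y):
    """Compute x AND y into a fresh output bit and restore the work bit to 0."""
    work, out = 0, 0                     # ancilla bits initialized to zero
    x, y, work = fredkin(x, y, work)     # compute: work = x AND y, y may now be garbage
    out ^= work                          # copy the result with a CNOT
    x, y, work = fredkin(x, y, work)     # uncompute: the Fredkin gate is its own inverse
    return x, y, work, out


for x in (0, 1):
    for y in (0, 1):
        print((x, y), clean_and(x, y))
```

After the final step the inputs are restored and the work bit is back to zero, so only the copied result remains.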

The quantum circuit model is used as the standard definition of quantum computers. 
The class of problems solvable in polynomial time with quantum circuits is called BQP, 
which stands for Bounded-error Quantum Polynomial-time. The initial state given to the 
quantum circuit must be a computational basis state corresponding to a bitstring encoding 
the problem instance, plus optionally a supply of polynomially many ancilla qubits initial- 
ized to |0). The output is obtained by measuring a single qubit in the computational basis. 
BQP is a class of decision problems, and the measurement outcome is considered to be 
yes or no depending on whether the measurement yields one or zero. A decision problem 
belongs to BQP if there exists a uniform family of quantum circuits whose number of gates 
scales polynomially with the input size n, such that the output is correct with probability 
at least 2/3 for every problem instance. BQP is thus the quantum analogue of BPP. 

A family of quantum circuits is considered to be uniform if the circuit for any given n 
can be generated in poly(n) time by a classical computer. Allowing the family of circuits 
to be generated by a quantum computer does not increase the power of the model. This is 
a consequence of the principle of deferred measurement, as discussed in appendix E. 

Because probabilities arise naturally in quantum mechanics, most studies of quantum 
computation focus on probabilistic computations and complexity classes. Deterministic 
quantum computation can certainly be defined, and some quantum algorithms succeed 
with probability one while still achieving a speedup over classical computation*. However, 
restricting to deterministic quantum algorithms seems somewhat artificial. Most of the lit- 
erature on quantum algorithms and complexity assumes the probabilistic setting by default, 
as does this thesis. 

*For example, the Bernstein-Vazirani algorithm achieves this. 

Recall that MA is the probabilistic version of NP. That is, it is the class of problems 
whose YES instances have probabilistically verifiable witnesses. There are two quantum 
analogues to MA, depending on whether the witnesses are classical or quantum. The set of 
decision problems whose solutions are efficiently verifiable on a quantum computer given a 
classical bitstring as a witness is called QCMA. The set of decision problems whose solutions 
are efficiently verifiable on a quantum computer given a quantum state as a witness is 
called QMA. Many important physical problems are now known to be QMA-complete, 
such as computing the ground state energy of arbitrary Hamiltonians made from two-body 
interactions [113], and determining the consistency of a set of density matrices [124]. One 
can also define space bounded quantum computation. BQPSPACE is the class of problems 
solvable with bounded error on a quantum computer with polynomial space and unlimited 
time. Perhaps surprisingly, BQPSPACE = PSPACE [ITO]. (As an aside, NPSPACE = 
PSPACE [ES]!) 

The class of problems solvable by logarithmic depth quantum circuits is called BQNC$^1$. 
This class is potentially relevant for physical implementation of quantum computers because 
if quantum gates can be performed in parallel, then the BQNC$^1$ computations can be carried 
out in logarithmic time. This greatly reduces the time one needs to maintain the coherence 
of the qubits. Interestingly, an approximate quantum Fourier transform can be done using 
a logarithmic depth quantum circuit. As a result, factoring can be done with polynomially 
many uses of logarithmic depth quantum circuits, followed by a polynomial amount of 
classical postprocessing [38]. 

As mentioned previously, it is easy to see that problems solvable in classical space f(n) 
are solvable in classical time $2^{f(n)}$ because there are only $2^{f(n)}$ states that the computer can 
be in. Thus, after $2^{f(n)}$ steps the computer must reenter a previously used state and repeat 
itself. Quantum mechanically the situation is different. For any fixed $\epsilon \ll 1$, in a Hilbert 
space of dimension d one can fit exponentially many nonoverlapping patches of size $\epsilon$ as a 
function of d. (We could define a patch of size $\epsilon$ centered at $|\psi\rangle$ as 
$\{ |\phi\rangle : \| |\phi\rangle - |\psi\rangle \| < \epsilon \}$.) 
Thus there are doubly exponentially many reasonably distinct states of n qubits. Hence 
there is not an analogous argument to show that problems solvable in quantum space 
f(n) are solvable in quantum time exp(f(n)). Nevertheless, this statement is true. It 
can be proven using the previously described universality construction based on two-level 
unitaries. Working through the construction in detail one finds that any $2^n \times 2^n$ unitary 
can be constructed from $O(2^{2n})$ two-level unitaries, and any two-level unitary on the Hilbert 
space of n qubits can be achieved using $O(n^2)$ CNOT gates plus one arbitrary single-qubit 
gate. Thus no computation on n qubits can require more than $O(2^{2n} n^2)$ gates. (However, 
finding the appropriate gate sequence may be difficult.) 

1.3 Quantum Algorithms 
1.3.1 Introduction 

By now it is well-known that quantum computers can solve certain problems much faster 
than the best known classical algorithms. The most famous example is that quantum 
computers can factor n-bit numbers in time polynomial in n [160], whereas no known classical 
algorithm can do this. The quantum algorithm which achieves this is known as Shor's 
factoring algorithm. As discussed in section 1.2, a quantum algorithm can be defined as 
a uniform family of quantum circuits, and the running time is the number of gates as a 
function of the number of bits of input. 

Figure 1-6: Quantum oracles must be unitary. One can always achieve this by using separate 
input and output registers. The input register is left unchanged and the output is added 
into the output register bitwise modulo 2. If the output register y is initialized to 0000..., then 
after applying the oracle it will contain f(x). 

Quantum algorithms can be categorized into two types based on the method by which 
the problem instance is given to the quantum computer. The most obvious and 
fundamental way to provide the input is as a bitstring. This is how the input is provided 
to the factoring algorithm. The second way of providing the input to a quantum algorithm 
is through an oracle. The oracular setting is very much analogous to the classical oracular 
setting, with the additional restriction that the oracle must be unitary. Any classical oracle 
can be made unitary by the general technique of reversible computation, as shown in figure 
1-6. 
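
As a hedged illustration of the construction in the caption of figure 1-6, the sketch below builds the map $(x, y) \mapsto (x, y \oplus f(x))$ on basis-state labels and checks that it is a permutation, so the corresponding matrix is unitary, even though the example function itself is not invertible. The function names and the example f are assumptions made for this sketch.

```python
def oracle_permutation(f, num_in, num_out):
    """Map each basis state (x, y) to (x, y XOR f(x)), encoded as an integer index."""
    size_out = 2 ** num_out
    perm = []
    for x in range(2 ** num_in):
        for y in range(size_out):
            perm.append(x * size_out + (y ^ f(x)))
    return perm


def example_f(x):
    """Hypothetical non-invertible function from 2 bits to 2 bits."""
    return (x * x) % 4


perm = oracle_permutation(example_f, 2, 2)
is_bijection = sorted(perm) == list(range(len(perm)))
is_self_inverse = all(perm[perm[i]] == i for i in range(len(perm)))
print(is_bijection, is_self_inverse)
```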



The second most famous quantum algorithm is oracular. The oracle implements the 
function $f : \{0, 1, \ldots, N\} \to \{0, 1\}$ defined by 

$$f(x) = \begin{cases} 1 & \text{if } x = w \\ 0 & \text{otherwise.} \end{cases}$$

The task is to find the "winner" w. Classically, the only way to do this with guaranteed 
success is to query all values of x. Even on average, one needs to query N/2 values. On 
a quantum computer this can be achieved using $O(\sqrt{N})$ queries [86]. The algorithm which 
achieves this is known as Grover's searching algorithm. The queries made to the oracle 
are superpositions of multiple inputs. Quantum computers cannot solve this problem using 
fewer than $\Omega(\sqrt{N})$ queries p3|. Brute-force searching is a common subroutine in classical 
algorithms. Thus, many classical algorithms can be sped up by using Grover search as a 
subroutine. Furthermore, quantum algorithms achieve quadratic speedups for searching in 
the presence of more than one winner [31], evaluating the sum of an arbitrary function [31, 32, 
131], finding the global minimum of an arbitrary function [62, 135], and approximating definite 
integrals [139]. These algorithms are based on Grover's search algorithm. 
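
The $O(\sqrt{N})$ query count can be seen in a small classical simulation of the state vector. This is a sketch under stated assumptions, not the construction of [86] verbatim: the oracle is modelled in its phase-flip form, the items are indexed 0 to N-1, and the iteration count $\lfloor (\pi/4)\sqrt{N} \rfloor$ is the standard choice for a single winner.

```python
import math


def grover_search(num_items, winner):
    """Simulate Grover's algorithm on a vector of num_items real amplitudes.

    Returns the probability of measuring the winner and the number of oracle queries.
    """
    amp = [1 / math.sqrt(num_items)] * num_items      # uniform superposition
    num_queries = int(math.floor(math.pi / 4 * math.sqrt(num_items)))
    for _ in range(num_queries):
        amp[winner] = -amp[winner]                    # oracle: flip the winner's sign
        mean = sum(amp) / num_items
        amp = [2 * mean - a for a in amp]             # diffusion: inversion about the mean
    return amp[winner] ** 2, num_queries


prob, queries = grover_search(1024, winner=123)
print(prob, queries)
```

With N = 1024 the simulation uses 25 oracle queries, compared with roughly N/2 = 512 classical queries on average.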

From a complexity point of view, a quantum algorithm provides an upper bound on the 
quantum complexity of a given problem. It is also interesting to look for lower bounds on 
the quantum complexity of problems, or in other words upper limits on the power of quantum 
computers. The techniques for doing so are very different in the oracular versus nonoracular 
settings. 

In the oracular setting, several powerful methods are known for proving lower bounds on 
the quantum query complexity of problems [11, 21]. The $\Omega(\sqrt{N})$ lower bound for searching is 
one example of this. For some oracular problems it is proven that quantum computers do not 
offer any speedup over classical computers beyond a constant factor. For example, suppose 
we are given an oracle computing an arbitrary function of the form $f : \{0, 1, \ldots, N\} \to \{0, 1\}$, 
and we wish to compute the parity 

$$\bigoplus_{x=1}^{N} f(x).$$

Both quantum and classical computers need $\Omega(N)$ queries to achieve this [67]. 

In the non-oracular setting there are essentially^ no known techniques for proving lower 
bounds on quantum (or classical) complexity. One can see however that quantum computers 
cannot achieve superexponential speedups over classical computers because they can be 
classically simulated with exponential overhead. Extending this reasoning, it is clear that 
one could show by diagonalization [141] that for any polynomial p(n) there exist problems 
in EXP which cannot be solved on a quantum computer in time less than p(n). A different 
type of upper bound on the power of quantum computers is that BQP $\subseteq$ P$^{\#P}$, as shown 
in [2l]. 

Arguably the most important class of known quantum algorithms from a practical point 
of view are those for quantum simulation. The problem of simulating quantum systems has 
great economic and scientific significance. Many problems, such as the design of new drugs 
and materials, and understanding condensed matter systems such as high temperature 
superconductors, would likely be much easier if quantum many-body systems could be 
efficiently simulated. It seems that this cannot be done on classical computers because the 
dimension of the Hilbert space grows exponentially with the number of degrees of freedom. 
Thus, even writing down the wavefunction would require exponential resources. 

In contrast to classical computers, it is generally believed that standard quantum com- 
puters can efficiently simulate all nonrelativistic quantum systems. That is, the number of 
gates and number of qubits needed to simulate a system of n particles for time t should scale 
polynomially in n and t. The essential reason for this is that Hamiltonians arising in nature 
generally consist of few-body interactions. Few-body interactions can be simulated using 
few-body quantum gates via the Trotter formula. The exact form of the few-body interac- 
tions is irrelevant due to gate universality. Furthermore, even if a Hamiltonian is not a sum 
of few-body terms, it can still be efficiently simulated provided that each row of the matrix 
has at most polynomially many nonzero entries and these entries can be computed efficiently. 
Methods for quantum simulation are described in [72l |43l [l79l [ml El O ESI [1091 [l23]. If a 
physical system were discovered that could not be simulated in polynomial time by a 
quantum computer, and that system could be reliably controlled, then it could presumably be 
used to construct a computer more powerful than standard quantum computers. Currently, it 
is not fully known whether relativistic quantum field theory can be efficiently simulated by 
quantum computers. In fact, the task of formulating a well-defined mathematical theory of 
computation based on quantum field theory appears to be difficult. 
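
For reference, the Trotter formula mentioned above can be written out for a Hamiltonian with two terms. The notation ($H_1$, $H_2$, the number of time slices r) is introduced here for illustration, and the error estimate is the standard first-order bound rather than a statement taken from the original text.

```latex
% Assumption: H = H_1 + H_2, where each H_k is a few-body term that can be
% exponentiated directly with few-body gates; r is the number of time slices.
e^{-i (H_1 + H_2) t}
  = \lim_{r \to \infty} \left( e^{-i H_1 t / r} \, e^{-i H_2 t / r} \right)^{r},
\qquad
\left\| e^{-i (H_1 + H_2) t}
  - \left( e^{-i H_1 t / r} \, e^{-i H_2 t / r} \right)^{r} \right\|
  = O\!\left( \frac{\| [H_1, H_2] \| \, t^{2}}{r} \right).
```

Under these assumptions, choosing r polynomially large in t and in the desired inverse precision keeps the total number of gates polynomial.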

Not all quantities that arise in the study of physics are easily computable using quantum 
computers. For example, finding the ground energy of an arbitrary local Hamiltonian is 
QMA-hard [113], and evaluating the partition function of the classical Potts model is 
#P-hard†. Therefore it is unlikely that these problems can be solved in general on a quantum 
computer in polynomial time. It is perhaps not surprising that some partition functions 
cannot be efficiently evaluated, because partition functions are not directly measurable 
by physical means, and thus not computable by the simulation of a physical process. In 
contrast, information about the eigenenergies of physical systems can be measured by 
spectroscopy. The problem is, for some systems, the time needed to cool them into the ground 
state may be extremely long. Correspondingly, on a quantum computer, the energy of a 
given eigenstate can be efficiently determined to polynomial precision by the method of 
phase estimation (see appendix C), but there may be no efficient method to prepare the 
ground state. 

^One can prove very weak statements such as the fact that most problems cannot be solved in less than 
the time it takes to read the entire input. Also certain extremely difficult problems, such as optimally playing 
generalized chess, are EXP-complete. These problems provably are not in P. 

†The Potts model partition function is a special case of the Tutte polynomial, as discussed in 5 . It was 
shown in [101] that exact evaluation of the Tutte polynomial at all but a few points is #P-hard. 

Several other quantum algorithms are known. A list of known quantum algorithms is 
given below. I have attempted to be comprehensive, although there are probably a few 
oversights. By known results regarding reversible computation, any classical algorithm can 
be implemented on a quantum computer with only constant overhead. Thus, I only list 
quantum algorithms achieving a speedup over the fastest known classical algorithm. Fur- 
thermore, any quantum circuit solves the problem of computing its own output. Thus to 
keep the list meaningful, I include only quantum algorithms achieving a speedup for a prob- 
lem that could have been stated prior to the concept of quantum computation (although 
not all of these problems necessarily were) . Most quantum algorithms in the literature meet 
this criterion. 



1.3.2 Algebraic and Number Theoretic Problems 



Algorithm: Factoring 
Type: Non-oracular 
Speedup: Superpolynomial 

Description: Given an n-bit integer, find the prime factorization. The quantum 
algorithm of Peter Shor solves this in poly(n) time [160]. The fastest known classical algorithm 
requires time superpolynomial in n. This algorithm breaks the RSA cryptosystem. At 
the core of this algorithm is order finding, which can be reduced to the Abelian hidden 
subgroup problem. 



Algorithm: Discrete-log 
Type: Non-oracular 
Speedup: Superpolynomial 

Description: We are given three n-bit numbers a, b, and A'", with the promise that b = 
mod A^ for some s. The task is to find s. As shown by Shor[160j, this can be achieved on 
a quantum computer in poly(n) time. The fastest known classical algorithm requires time 
superpolynomial in n. See also Abelian hidden subgroup. 
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Algorithm: Pell's Equation 
Type: Non-oracular 
Speedup: Super polynomial 

Description: Given a positive nonsquare integer d, Pell's equation is $x^2 - dy^2 = 1$. For 
any such d there are infinitely many pairs of integers (x, y) solving this equation. Let 
$(x_1, y_1)$ be the pair of positive integers that minimizes $x + y\sqrt{d}$. If d is an n-bit 
integer (i.e. $0 \le d < 2^n$), then $(x_1, y_1)$ may in general require exponentially many bits 
to write down. Thus it is in general impossible to find $(x_1, y_1)$ in polynomial time. Let 
$R = \log(x_1 + y_1\sqrt{d})$. The nearest integer $\lfloor R \rceil$ uniquely identifies $(x_1, y_1)$. 
As shown by Hallgren [88], given an n-bit number d, a quantum computer 
can find $\lfloor R \rceil$ in poly(n) time. No polynomial time classical algorithm for this problem is 
known. Factoring reduces to this problem. This algorithm breaks the Buchmann-Williams 
cryptosystem. See also Abelian hidden subgroup. 
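The growth of the fundamental solution can be seen with a small classical sketch. It uses the textbook continued-fraction expansion of $\sqrt{d}$, which is not the quantum algorithm discussed above; the function names are illustrative, and the floating-point regulator is only meaningful while $x_1$ and $y_1$ fit in a float.

```python
from math import isqrt, log, sqrt

def pell_fundamental(d):
    """Smallest positive (x, y) with x*x - d*y*y == 1, found from the
    convergents of the continued fraction of sqrt(d)."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, q, a = 0, 1, a0
    h_prev, h = 1, a0        # numerators of successive convergents
    k_prev, k = 0, 1         # denominators of successive convergents
    while h * h - d * k * k != 1:
        m = q * a - m
        q = (d - m * m) // q
        a = (a0 + m) // q
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k

def regulator(d):
    """R = log(x1 + y1*sqrt(d)), in floating point (small d only)."""
    x, y = pell_fundamental(d)
    return log(x + y * sqrt(d))
```

Even for the two-digit input d = 61 the fundamental solution is already ten digits long, whereas `round(regulator(61))` is a small integer. The running time of this sketch grows with the length of the continued-fraction period, which can be exponential in the bit length of d.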



Algorithm: Principal Ideal 
Type: Non-oracular 
Speedup: Superpolynomial 

Description: We are given an n-bit integer d and an invertible ideal I of the ring 
$\mathbb{Z}[\sqrt{d}]$. I is a principal ideal if there exists $\alpha \in \mathbb{Q}(\sqrt{d})$ such that 
$I = \alpha\mathbb{Z}[\sqrt{d}]$. $\alpha$ may be exponentially large in d. Therefore $\alpha$ cannot in 
general even be written down in polynomial time. 
However, $\lfloor \log \alpha \rceil$ uniquely identifies $\alpha$. The task is to determine whether I is principal 
and if so find $\lfloor \log \alpha \rceil$. As shown by Hallgren, this can be done in polynomial time on a 
quantum computer [88]. Factoring reduces to solving Pell's equation, which reduces to the 
principal ideal problem. Thus the principal ideal problem is at least as hard as factoring 
and therefore is probably not in P. See also Abelian hidden subgroup. 



Algorithm: Unit Group 
Type: Non-oracular 
Speedup: Superpolynomial 

Description: The number field $\mathbb{Q}(\theta)$ is said to be of degree d if the lowest degree 
polynomial of which $\theta$ is a root has degree d. The set O of elements of $\mathbb{Q}(\theta)$ which are roots 
of monic polynomials in $\mathbb{Z}[x]$ forms a ring, called the ring of integers of $\mathbb{Q}(\theta)$. The set of 
units (invertible elements) of the ring O forms a group denoted O*. As shown by Hallgren 
[89], for any $\mathbb{Q}(\theta)$ of fixed degree, a quantum computer can find in polynomial time a set 
of generators for O*, given a description of $\theta$. No polynomial time classical algorithm for 
this problem is known. See also Abelian hidden subgroup. 

Algorithm: Class Group 
Type: Non-oracular 
Speedup: Superpolynomial 

Description: The number field $\mathbb{Q}(\theta)$ is said to be of degree d if the lowest degree 
polynomial of which $\theta$ is a root has degree d. The set O of elements of $\mathbb{Q}(\theta)$ which are 
roots of monic polynomials in $\mathbb{Z}[x]$ forms a ring, called the ring of integers of $\mathbb{Q}(\theta)$. For 
such a ring, the ideals modulo the principal ideals form a group called the class group. As shown by 
Hallgren [89], a quantum computer can find in polynomial time a set of generators for the 
class group of the ring of integers of any constant degree number field, given a description 
of $\theta$. No polynomial time classical algorithm for this problem is known. See also Abelian 
hidden subgroup. 



Algorithm: Hidden Shift 
Type: Oracular 
Speedup: Superpolynomial 

Description: We are given oracle access to some function f(x) on a domain of size N. 
We know that f(x) = g(x + s) where g is a known function and s is an unknown shift. 
The hidden shift problem is to find s. By reduction from Grover's problem it is clear 
that at least $\sqrt{N}$ queries are necessary to solve hidden shift in general. However, certain 
special cases of the hidden shift problem are solvable on quantum computers using O(1) 
queries. In particular, van Dam et al. showed that this can be done if f is a multiplicative 
character of a finite ring or field [167]. The previously discovered shifted Legendre symbol 
algorithm [IMl HMj is subsumed as a special case of this, because the Legendre symbol 
$\left(\frac{x}{p}\right)$ is a multiplicative character of $\mathbb{F}_p$. No classical algorithm running in 
time O(polylog(N)) is known for these problems. Furthermore, the quantum algorithm for 
the shifted Legendre symbol problem breaks certain classical cryptosystems [167]. 



Algorithm: Gauss Sums 
Type: Non-oracular 
Speedup: Superpolynomial 

Description: Let $\mathbb{F}_q$ be a finite field. The elements other than zero of $\mathbb{F}_q$ form a group 
$\mathbb{F}_q^\times$ under multiplication, and the elements of $\mathbb{F}_q$ form an (Abelian but not necessarily 
cyclic) group $\mathbb{F}_q^+$ under addition. We can choose some representation $\rho^\times$ of $\mathbb{F}_q^\times$ and some 
representation $\rho^+$ of $\mathbb{F}_q^+$. Let $\chi^\times$ and $\chi^+$ be the characters of these representations. 
The Gauss sum corresponding to $\rho^\times$ and $\rho^+$ is the inner product of these characters: 

$$\sum_{x \neq 0 \in \mathbb{F}_q} \chi^+(x) \chi^\times(x).$$

As shown by van Dam and Seroussi [168], Gauss sums can be 
estimated to polynomial precision on a quantum computer in polynomial time. Although 
a finite ring does not form a group under multiplication, its set of units does. Choosing 
a representation for the additive group of the ring, and choosing a representation for the 
multiplicative group of its units, one can obtain a Gauss sum over the units of a finite ring. 
These can also be estimated to polynomial precision on a quantum computer in polynomial 
time [168]. No polynomial time classical algorithm for estimating Gauss sums is known. 
Furthermore, discrete log reduces to Gauss sum estimation. 
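As a small worked instance, the sketch below evaluates a Gauss sum by brute force over a prime field $\mathbb{F}_p$. The choice of characters is an assumption made for illustration: the additive character is $x \mapsto e^{2\pi i x/p}$ and the multiplicative character is the Legendre symbol. The sum has p - 1 terms, so this takes time exponential in the number of bits of p.

```python
import cmath

def legendre(x, p):
    """Legendre symbol (x/p) for an odd prime p, via Euler's criterion."""
    t = pow(x, (p - 1) // 2, p)      # t is 0, 1, or p - 1
    return -1 if t == p - 1 else t

def gauss_sum(p):
    """Sum over nonzero x in F_p of chi_plus(x) * chi_times(x), with
    chi_plus(x) = exp(2*pi*i*x/p) and chi_times the Legendre symbol."""
    return sum(legendre(x, p) * cmath.exp(2j * cmath.pi * x / p)
               for x in range(1, p))
```

For these characters the magnitude of the sum is always $\sqrt{p}$, so the nontrivial information is in the phase, which is what an estimation algorithm has to recover.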
Algorithm: Abelian Hidden Subgroup 
Type: Oracular 
Speedup: Exponential 

Description: Let G be a finitely generated Abelian group, and let H be some subgroup 
of G such that G/H is finite. Let f be a function on G such that for any $g_1, g_2 \in G$, 
$f(g_1) = f(g_2)$ if and only if $g_1$ and $g_2$ are in the same coset of H. The task is to find 
H (i.e. find a set of generators for H) by making queries to f. This is solvable on a 
quantum computer using $O(\log |G|)$ queries, whereas classically $\Omega(|G|)$ are required. This 
algorithm was first formulated in full generality by Boneh and Lipton in [30]. However, 
proper attribution of this algorithm is difficult because, as described in chapter 5 of [137], 
it subsumes many historically important quantum algorithms as special cases, including 
Simon's algorithm, which was the inspiration for Shor's period finding algorithm, which 
forms the core of his factoring and discrete-log algorithms. The Abelian hidden subgroup 
algorithm is also at the core of the Pell's equation, principal ideal, unit group, and class 
group algorithms. In certain instances, the Abelian hidden subgroup problem can be solved 
using a single query rather than $\log(|G|)$; see [55]. 



Algorithm: Non-Abelian Hidden Subgroup 
Type: Oracular 
Speedup: Exponential 

Description: Let G be a finitely generated group, and let H be some subgroup of G 
that has finitely many left cosets. Let f be a function on G such that for any $g_1, g_2 \in G$, 
$f(g_1) = f(g_2)$ if and only if $g_1$ and $g_2$ are in the same left coset of H. The task is to 
find H (i.e. find a set of generators for H) by making queries to f. This is solvable on a 
quantum computer using $O(\log |G|)$ queries, whereas classically $\Omega(|G|)$ are required [65, 90]. 
However, this does not qualify as an efficient quantum algorithm because in general, it 
may take exponential time to process the quantum states obtained from these queries. 
Efficient quantum algorithms for the hidden subgroup problem are known for certain 
specific non-Abelian groups [l50l [Ml [ISQl [961 [HI [Ml (Ml [Ml [ffl [751 [771 [l6j . A slightly 
outdated survey is given in [125]. Of particular interest are the symmetric group and the 
dihedral group. A solution for the symmetric group would solve graph isomorphism. A 
solution for the dihedral group would solve certain lattice problems [147]. Despite much 
effort, no polynomial-time solution for these groups is known. However, Kuperberg [120] 
found a time $2^{O(\sqrt{\log N})}$ algorithm for finding a hidden subgroup of the dihedral group 
$D_N$. Regev subsequently improved this algorithm so that it uses not only subexponential 
time but also polynomial space [148]. 

1.3.3 Oracular Problems 



Algorithm: Searching 
Type: Oracular 
Speedup: Polynomial 

Description: We are given an oracle with N allowed inputs. For one input w ("the 
winner") the corresponding output is 1, and for all other inputs the corresponding output is 
0. The task is to find w. On a classical computer this requires $\Omega(N)$ queries. The quantum 
algorithm of Lov Grover achieves this using $O(\sqrt{N})$ queries [86]. This algorithm has 
subsequently been generalized to search in the presence of multiple "winners" [31], evaluate 
the sum of an arbitrary function [31, 32, 131], find the global minimum of an arbitrary 
function [62, 135], and approximate definite integrals [139]. The generalization of Grover's 
algorithm known as amplitude estimation [33] is now an important primitive in quantum 
algorithms. Amplitude estimation forms the core of most known quantum algorithms 
related to collision finding and graph properties. 
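The $O(\sqrt{N})$ query count can be checked numerically with a small state-vector simulation. This is a classical sketch of the standard Grover iteration (a sign flip on the winner followed by inversion about the mean); the function name and the default iteration count $\lfloor \frac{\pi}{4}\sqrt{N} \rfloor$ are choices made here for illustration.

```python
from math import floor, pi, sqrt

def grover_success_probability(N, w, iterations=None):
    """Simulate Grover search over N items with winner w.
    Returns (probability of measuring w, number of oracle queries)."""
    if iterations is None:
        iterations = floor(pi / 4 * sqrt(N))
    amp = [1 / sqrt(N)] * N                  # uniform superposition
    for _ in range(iterations):
        amp[w] = -amp[w]                     # oracle query: flip the winner's sign
        mean = sum(amp) / N
        amp = [2 * mean - a for a in amp]    # inversion about the mean
    return amp[w] ** 2, iterations
```

With N = 64 the default is 6 oracle queries, after which the winner is measured with probability above 0.99, compared with an expected N/2 = 32 queries for classical guessing.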



Algorithm: Bernstein-Vazirani 
Type: Oracular 
Speedup: Polynomial 

Description: We are given an oracle whose input is n bits and whose output is one bit. 
Given input $x \in \{0,1\}^n$, the output is $x \odot h$, where h is the "hidden" string of n bits, 
and $\odot$ denotes the bitwise inner product modulo 2. The task is to find h. On a classical 
computer this requires n queries. As shown by Bernstein and Vazirani j2l], this can be 
achieved on a quantum computer using a single query. Furthermore, one can construct 
a recursive version of this problem, called recursive Fourier sampling, such that quantum 
computers require exponentially fewer queries than classical computers [24]. 
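The contrast between n classical queries and one quantum query can be sketched as follows. The quantum part is a classical state-vector simulation, so it evaluates the oracle on all $2^n$ inputs to model a single query made in superposition; the function names are illustrative.

```python
def make_oracle(h):
    """Oracle returning the inner product of x and h modulo 2 (bit strings as ints)."""
    def oracle(x):
        return bin(x & h).count("1") % 2
    return oracle

def classical_find(oracle, n):
    """Recover h with n queries, one per unit vector."""
    h = 0
    for i in range(n):
        if oracle(1 << i):
            h |= 1 << i
    return h

def simulated_quantum_find(oracle, n):
    """Simulate one phase query followed by Hadamards on every qubit."""
    N = 1 << n
    # State after the query: amplitudes (-1)^(x.h) / sqrt(N)
    amp = [(-1) ** oracle(x) / N ** 0.5 for x in range(N)]
    # Hadamard on every qubit, computed as a fast Walsh-Hadamard transform
    step = 1
    while step < N:
        for start in range(0, N, 2 * step):
            for j in range(start, start + step):
                a, b = amp[j], amp[j + step]
                amp[j] = (a + b) / 2 ** 0.5
                amp[j + step] = (a - b) / 2 ** 0.5
        step *= 2
    # All amplitude is now concentrated on the basis state h
    return max(range(N), key=lambda x: abs(amp[x]))
```

In the simulation the final state has amplitude 1 on h and 0 elsewhere, which is why a single measurement suffices.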



Algorithm: Deutsch-Jozsa 
Type: Oracular 
Speedup: Polynomial 

Description: We are given an oracle whose input is n bits and whose output is one bit. 
We are promised that out of the $2^n$ possible inputs, either all of them, none of them, 
or half of them yield output 1. The task is to distinguish the balanced case (half of all 
inputs yield output 1) from the constant case (all or none of the inputs yield output 1). 
It was shown by Deutsch [57] that for n = 1, this can be solved on a quantum computer 
using one query, whereas any deterministic classical algorithm requires two. This was 
historically the first well-defined quantum algorithm achieving a speedup over classical 
computation. The generalization to arbitrary n was developed by Deutsch and Jozsa in 
[58]. Although probabilistically easy to solve with O(1) queries, the Deutsch-Jozsa problem 
has exponential worst case deterministic query complexity classically. 

Algorithm: NAND Tree 
Type: Oracular 
Speedup: Polynomial 

Description: A NAND gate takes two bits of input and produces one bit of output. By 
connecting together NAND gates, one can thus form a binary tree of depth n which has 
$2^n$ bits of input and produces one bit of output. The NAND tree problem is to evaluate 
the output of such a tree by making queries to an oracle which stores the values of the $2^n$ 
bits and provides any specified one of them upon request. Farhi et al. used a continuous 
time quantum walk model to show that a quantum computer can solve this problem using 
$O(2^{0.5n})$ time whereas a classical computer requires $\Omega(2^{0.753n})$ time [66]. It was soon shown 
that this result carries over into the conventional model of circuits and queries [45]. The 
algorithm was subsequently generalized for NAND trees of varying fanin and nonuniform 
depth [13], and to trees involving larger gate sets [149], and MIN-MAX trees [47]. 



Algorithm: Gradients 
Type: Oracular 
Speedup: Polynomial 

Description: We are given an oracle for computing some smooth function $f : \mathbb{R}^d \to \mathbb{R}$. The 
inputs and outputs to f are given to the oracle with finitely many bits of precision. The 
task is to estimate $\nabla f$ at some specified point $x_0 \in \mathbb{R}^d$. As I showed in [107], a quantum 
computer can achieve this using one query, whereas a classical computer needs at least d+1 
queries. In [36], Bulger suggested potential applications for optimization problems. As 
shown in appendix [Dl a quantum computer can use the gradient algorithm to find the 
minimum of a quadratic form in d dimensions using O(d) queries, whereas, as shown in 
[177], a classical computer needs at least $\Omega(d^2)$ queries. 



Algorithm: Ordered Search 
Type: Oracular 
Speedup: Constant 

Description: We are given oracle access to a list of N numbers in order from least to 
greatest. Given a number x, the task is to find out where in the list it would fit. Classically, 
the best possible algorithm is binary search which takes $\log_2 N$ queries. Farhi et al. showed 
that a quantum computer can achieve this using $0.53 \log_2 N$ queries [68]. Currently, the 
best known deterministic quantum algorithm for this problem uses $0.433 \log_2 N$ queries. 
A lower bound of ^ log2 quantum queries has been proven for this problem[42J. In [22] . 
a randomized quantum algorithm is given whose expected query complexity is less than 
Algorithm: Graph Properties 
Type: Oracular 
Speedup: Polynomial 

Description: A common way to specify a graph is by an oracle, which given a pair of 
vertices, reveals whether they are connected by an edge. This is called the adjacency 
matrix model. It generalizes straightforwardly for weighted and directed graphs. Building 
on previous work [62, 94, 63], Dürr et al. [61] show that finding a minimum spanning tree 
of a weighted graph and deciding connectivity for directed and undirected graphs have 
$\Theta(n^{3/2})$ quantum query complexity, and that finding lowest 
weight paths has $O(n^{3/2} \log^2 n)$ quantum query complexity. Berzina et al. [26] show that 
deciding whether a graph is bipartite can be achieved using $O(n^{3/2})$ quantum queries. All 
of these problems are thought to have $\Omega(n^2)$ classical query complexity. For many of these 
problems, the quantum complexity is also known for the case where the oracle provides an 
array of neighbors rather than entries of the adjacency matrix [61]. See also triangle finding. 

Algorithm: Welded Tree 
Type: Oracular 
Speedup: Exponential 

Description: Some computational problems can be phrased in terms of the query 
complexity of finding one's way through a maze. That is, there is some graph G to which 
one is given oracle access. When queried with the label of a given node, the oracle returns 
a list of the labels of all adjacent nodes. The task is, starting from some source node (i.e. 
its label), to find the label of a certain marked destination node. As shown by Childs et 
al. [31], quantum computers can exponentially outperform classical computers at this task 
for at least some graphs. Specifically, consider the graph obtained by joining together two 
depth-n binary trees by a random "weld" such that all nodes but the two roots have degree 
three. Starting from one root, a quantum computer can find the other root using poly(n) 
queries, whereas this is provably impossible using polynomially many classical queries. 

Algorithm: Collision Finding 
Type: Oracular 
Speedup: Polynomial 

Description: Suppose we are given oracle access to a two-to-one function f on a domain 
of size N. The collision problem is to find a pair of distinct $x, y \in \{1, 2, \ldots, N\}$ such that 
f(x) = f(y). The classical randomized query complexity of this problem is $\Theta(\sqrt{N})$, whereas, 
as shown by Brassard et al., a quantum computer can achieve this using $O(N^{1/3})$ queries [34]. 
Buhrman et al. subsequently showed that a quantum computer can also find a collision in 
an arbitrary function on a domain of size N, provided that one exists, using $O(N^{3/4} \log N)$ 
queries [37], whereas the classical query complexity is $\Theta(N \log N)$. The decision version 
of collision finding is called element distinctness, and also has $\Theta(N \log N)$ classical query 
complexity. Ambainis subsequently improved upon [3l], achieving a quantum query 
complexity of $O(N^{2/3})$ for element distinctness, which is optimal, and extending to the 
case of k-fold collisions [12]. Given two functions f and g, each on a domain of size N, 
a claw is a pair x, y such that f(x) = g(y). A quantum computer can find claws using 
$O(N^{3/4} \log N)$ queries [37]. 

Algorithm: Triangle Finding 
Type: Oracular 
Speedup: Polynomial 

Description: Suppose we are given oracle access to a graph on N nodes. When queried with 
a pair of nodes, the oracle reveals whether an edge connects them. The task is to find a 
triangle (i.e. a clique of size three) if one exists. As shown by Buhrman et al. [37], a 
quantum computer can accomplish this using $O(N^{3/2})$ queries, whereas it is conjectured 
that classically one must query all $\binom{N}{2}$ edges. Magniez et al. subsequently improved 
on this, finding a triangle with $O(N^{13/10})$ quantum queries [126]. 

Algorithm: Matrix Commutativity 
Type: Oracular 
Speedup: Polynomial 

Description: We are given oracle access to k matrices, each of which is n x n. Given 
integers $i, j \in \{1, 2, \ldots, n\}$ and $x \in \{1, 2, \ldots, k\}$, the oracle returns the ij matrix 
element of the x-th matrix. The task is to decide whether all of these k matrices commute. 
As shown by Itakura [97], this can be achieved on a quantum computer using 
$O(k^{4/5} n^{9/5})$ queries, whereas classically this requires $O(kn^2)$ queries. 



Algorithm: Hidden Nonlinear Structures 
Type: Oracular 
Speedup: Exponential 

Description: Any Abelian group G can be visualized as a lattice. A subgroup H of 
G is a sublattice, and the cosets of H are all the shifts of that sublattice. The Abelian 
hidden subgroup problem is normally solved by obtaining a superposition over a random 
coset of the hidden subgroup, and then taking the Fourier transform so as to sample 
from the dual lattice. Rather than generalizing to non-Abelian groups (see non-Abelian 
hidden subgroup), one can instead generalize to the problem of identifying hidden subsets 
other than lattices. As shown by Childs et al. [39], this problem is efficiently solvable on 
quantum computers for certain subsets defined by polynomials, such as spheres. Decker et 
al. showed how to efficiently solve some related problems in [56]. 



Algorithm: Order of Blackbox Group 
Type: Oracular 
Speedup: Exponential 

Description: Suppose a finite group G is given oracularly in the following way. To every 
element in G, one assigns a corresponding label. Given an ordered pair of labels of group 
elements, the oracle returns the label of their product. The task is to find the order of the 
group, given the labels of a set of generators. Classically, this problem cannot be solved 
using polylog(|G|) queries even if G is Abelian. For Abelian groups, quantum computers 
can solve this problem using polylog(|G|) queries by reducing it to the Abelian hidden 
subgroup problem, as shown by Mosca [132]. Furthermore, as shown by Watrous [169], this 
problem can be solved in polylog(|G|) queries for any solvable group. 

1.3.4 Approximation and BQP-complete Problems 



Algorithm: Quantum Simulation 
Type: Non-oracular 
Speedup: Exponential 

Description: It is believed that for any physically realistic Hamiltonian H on n degrees 
of freedom, the corresponding time evolution operator $e^{-iHt}$ can be implemented using 
poly(n, t) gates. Unless BPP=BQP, this problem is not solvable in general on a classical 
computer in polynomial time. Many techniques for quantum simulation have been 
developed for different applications 031 1791 [3 O [Ml [IMl 123]. The exponential 
complexity of classically simulating quantum systems led Feynman to first propose that 
quantum computers might outperform classical computers on certain tasks [72]. 



Algorithm: Jones Polynomial 
Type: Non-oracular 
Speedup: Exponential 

Description: As shown by Freedman et al. [74, 73], finding a certain additive 
approximation to the Jones polynomial of the plat closure of a braid at $e^{2\pi i/5}$ is a BQP-complete 
problem. This result was reformulated and extended to $e^{2\pi i/k}$ for arbitrary k by Aharonov 
et al. [6, 4]. Wocjan and Yard further generalized this, obtaining a quantum algorithm to 
estimate the HOMFLY polynomial [174], of which the Jones polynomial is a special case. 
Aharonov et al. subsequently showed that quantum computers can in polynomial time 
estimate a certain additive approximation to the even more general Tutte polynomial for 
planar graphs [5]. The hardness of the additive approximation obtained in [5] is not yet 
fully understood. As discussed in chapter 3 of this thesis, the problem of finding a certain 
additive approximation to the Jones polynomial of the trace closure of a braid at $e^{2\pi i/5}$ is 
DQC1-complete. 



Algorithm: Zeta Functions 
Type: Non-oracular 
Speedup: Superpolynomial 

Description: As shown by Kedlaya [112], quantum computers can determine the zeta 
function of a genus g curve over a finite field $\mathbb{F}_q$ in time polynomial in g and log q. No 
polynomial time classical algorithm for this problem is known. More speculatively, van 
Dam has conjectured that due to a connection between the zeros of zeta functions and the 
eigenvalues of certain quantum operators, quantum computers might be able to efficiently 
approximate the number of solutions to equations over finite fields [165]. Some evidence 
supporting this conjecture is given in [165]. 

Algorithm: Weight Enumerators 
Type: Non-oracular 
Speedup: Exponential 

Description: Let C be a code on n bits, i.e. a subset of $\mathbb{Z}_2^n$. The weight enumerator of 
C is $S_C(x,y) = \sum_{c \in C} x^{|c|} y^{n-|c|}$, where |c| denotes the Hamming weight of c. Weight 
enumerators have many uses in the study of classical codes. If C is a linear code, 
it can be defined by C = {c : Ac = 0} where A is a matrix over $\mathbb{Z}_2$. In this case 
$S_C(x,y) = \sum_{c : Ac=0} x^{|c|} y^{n-|c|}$. Quadratically signed weight enumerators (QWGTs) are a 
generalization of this: $S(A,B,x,y) = \sum_{c : Ac=0} (-1)^{c^T B c} x^{|c|} y^{n-|c|}$. Now consider the 
following special case. Let A be an n x n matrix over $\mathbb{Z}_2$ such that diag(A) = I. Let lwtr(A) be 
the lower triangular matrix resulting from setting all entries above the diagonal in A to zero. 
Let l, k be positive integers. Given the promise that 
$|S(A, \mathrm{lwtr}(A), k, l)| \ge \frac{1}{2}(k^2 + l^2)^{n/2}$, 
the problem of determining the sign of S(A, lwtr(A), k, l) is BQP-complete, as shown by 
Knill and Laflamme in [119]. The evaluation of QWGTs is also closely related to the 
evaluation of Ising and Potts model partition functions [1221 1751 ESI I80j . 
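The definitions can be made concrete with a brute-force sketch that enumerates all $2^n$ bit strings. It assumes the weight enumerator $\sum_{c : Ac=0} x^{|c|} y^{n-|c|}$ and the quadratically signed version with sign $(-1)^{c^T B c}$; matrices are plain lists of rows over $\mathbb{Z}_2$, and the function names are illustrative.

```python
from itertools import product

def codewords(A, n):
    """Yield every c in {0,1}^n with A c = 0 over Z_2."""
    for c in product((0, 1), repeat=n):
        if all(sum(a * b for a, b in zip(row, c)) % 2 == 0 for row in A):
            yield c

def weight_enumerator(A, n, x, y):
    """Sum of x^|c| * y^(n-|c|) over the code {c : A c = 0}."""
    return sum(x ** sum(c) * y ** (n - sum(c)) for c in codewords(A, n))

def qwgt(A, B, n, x, y):
    """Quadratically signed weight enumerator with sign (-1)^(c^T B c)."""
    total = 0
    for c in codewords(A, n):
        Bc = [sum(b * ci for b, ci in zip(row, c)) for row in B]
        exponent = sum(ci * v for ci, v in zip(c, Bc)) % 2
        total += (-1) ** exponent * x ** sum(c) * y ** (n - sum(c))
    return total
```

For the three-bit repetition code, with parity checks [[1, 1, 0], [0, 1, 1]], the codewords are 000 and 111, so the weight enumerator is $x^3 + y^3$. Taking B to be the identity flips the sign of the odd-weight codeword, giving $y^3 - x^3$.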

Algorithm: Simulated Annealing 
Type: Non-oracular 
Speedup: Polynomial 

Description: In simulated annealing, one has a series of Markov chains defined by 
stochastic matrices $M_1, M_2, \ldots, M_n$. These are slowly varying in the sense that their limiting 
distributions $\pi_1, \pi_2, \ldots, \pi_n$ satisfy $|\pi_{t+1} - \pi_t| < \epsilon$ for some small $\epsilon$. These distributions can 
often be thought of as thermal distributions at successively lower temperatures. If $\pi_1$ can 
be easily prepared then by applying this series of Markov chains one can sample from $\pi_n$. 
Typically, one wishes for $\pi_n$ to be a distribution over good solutions to some optimization 
problem. Let $\delta_i$ be the gap between the largest and second largest eigenvalues of $M_i$. Let 
$\delta = \min_i \delta_i$. The run time of this classical algorithm is proportional to $1/\delta$. Building 
upon results of Szegedy [162], Somma et al. have shown p^T] that quantum computers can 
sample from $\pi_n$ with a runtime proportional to $1/\sqrt{\delta}$. 

Algorithm: String Rewriting 
Type: Non-oracular 
Speedup: Exponential 

Description: String rewriting is a fairly general model of computation. String rewriting 
systems (sometimes called grammars) are specified by a list of rules by which certain 
substrings are allowed to be replaced by certain other substrings. For example, context-free 
grammars are equivalent to pushdown automata. In [103], Janzing and Wocjan 
showed that a certain string rewriting problem is PromiseBQP-complete. Thus quantum 
computers can solve it in polynomial time, but classical computers probably cannot. Given 
three strings s, t, and t', and a set of string rewriting rules satisfying certain promises, the 
problem is to find a certain approximation to the difference between the number of ways 
of obtaining t from s and the number of ways of obtaining t' from s. Similarly, certain 
problems of approximating the difference in number of paths between pairs of vertices in a 
graph, and difference in transition probabilities between pairs of states in a random walk 
are also BQP-complete [102]. 

Algorithm: Matrix Powers 
Type: Non-oracular 
Speedup: Exponential 

Description: Quantum computers have an exponential advantage in approximating 
matrix elements of powers of exponentially large sparse matrices. Suppose we have an 
N x N symmetric matrix A such that there are at most polylog(N) nonzero entries in each 
row, and given a row index, the set of nonzero entries can be efficiently computed. The 
task is, for any $1 \le i \le N$, and any m polylogarithmic in N, to approximate the 
i-th diagonal matrix element of $A^m$. The approximation is additive to within $b^m \epsilon$, where 
b is a given upper bound on $\|A\|$ and $\epsilon$ is of order 1/polylog(N). As shown by Janzing 
and Wocjan, this problem is PromiseBQP-complete, as is the corresponding problem for 
off-diagonal matrix elements [104]. Thus, quantum computers can solve it in polynomial 
time, but classical computers probably cannot. 

Algorithm: Verifying Matrix Products 
Type: Non-oracular 
Speedup: Polynomial 

Description: Given three n x n matrices, A, B, and C, the matrix product verification 
problem is to decide whether AB = C. Classically, the best known algorithm achieves this 
in time $O(n^2)$, whereas the best known classical algorithm for matrix multiplication runs 
in time $O(n^{2.376})$. Ambainis et al. discovered a quantum algorithm for this problem with 
runtime $O(n^{7/4})$ [10]. Subsequently, Buhrman and Spalek improved upon this, obtaining 
a quantum algorithm for this problem with runtime $O(n^{5/3})$ [35]. This latter algorithm is 
based on results regarding quantum walks that were proven in [162]. 
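The description does not name the $O(n^2)$ classical method; a natural candidate, assumed here, is Freivalds' randomized check, which tests $A(Br) = Cr$ for random 0/1 vectors r instead of forming the product AB. The sketch below uses plain lists of rows; the `trials` parameter and function name are illustrative.

```python
import random

def freivalds(A, B, C, trials=20, rng=random):
    """Randomized check of A B == C using O(trials * n^2) arithmetic operations.
    A True answer can be wrong with probability at most 2**(-trials);
    a False answer is always correct."""
    n = len(A)
    for _ in range(trials):
        r = [rng.randint(0, 1) for _ in range(n)]
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False             # a witness vector r proves A B != C
    return True
```

Each trial costs three matrix-vector products, which is where the quadratic running time comes from; the error is one-sided, so repeating the trial drives the failure probability down exponentially.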



1.3.5 Commentary 

As noted above, some of these algorithms break existing cryptosystems. I mention this not 
because I care about breaking cryptosystems, but because public key cryptosystems serve as 
a useful indicator of the general consensus regarding the computational difficulty of certain 
mathematical problems. For each known public key cryptosystem, the security proof rests 
on an assumption that a certain mathematical problem cannot be solved in polynomial 
time. As discussed in section 1.1, nobody knows how to prove that these problems cannot 
be solved in polynomial time. However, some of these problems, such as factoring, have 
resisted many years of attempts at polynomial time solution. Thus, many people consider 
it to be a safe assumption that factoring is hard. People and corporations also effectively 
wager money on this, as nearly all monetary transactions on the internet are encoded using 
the RSA cryptosystem, which is based on the assumption that factoring is hard to solve 
classically. 

Many of the problems solved by these quantum algorithms may seem somewhat esoteric. 
Upon hearing that quantum computers can approximate Tutte polynomials or solve Pell's 
equation, one may ask "Why should I care?" One answer to this is that mathematical 
algorithms sometimes have applications which are not discovered until long after the algorithm 
itself. A deeper answer is that complexity theory has shown that the ability to solve hard 
computational problems is to some degree a fungible resource. That is, many hard problems 
reduce to one another with polynomial overhead. By finding a polynomial time solution for 
one hard problem, one obtains polynomial time solutions for a class of problems that can 
appear unrelated. An interesting example of this is the LLL algorithm, for the apparently 
esoteric problem of finding a basis of short vectors for a lattice. This has subsequently 
found application in cryptography, error correction, and finding integer relations between 
numbers. LLL and subsequent variants for integer relation finding have even found use in 
computer assisted mathematics. Their achievements include, among other things, the 
discovery of a new formula for the digits of $\pi$ such that any digit of $\pi$ can be calculated using 
a constant amount of computation without having to calculate the preceding digits [18]! 

In light of such history, and in light of known properties of computational complexity, 
it makes sense to search for quantum algorithms that provide polynomial speedups for 
problems of direct practical relevance, and to try to find exponential speedups for any 
problem. 

1.4 What makes quantum computers powerful? 

The quantum algorithms described in section 1.3 establish rigorously that quantum comput- 
ers can solve some problems using far fewer queries than classical computers, and establish 
convincingly that quantum computers can solve certain problems using far fewer computa- 
tional steps than classical computers. It is natural to ask what aspect of quantum mechanics 
gives quantum computers their extra computational power. At first glance this appears to 
be a vague and ill-posed question. However, we can approach this question in a concrete way 
by taking away different aspects of quantum mechanics one at a time, and seeing whether 
the resulting models of computation retain the power of quantum computers. 

One necessary ingredient for the power of quantum computing is the exponentially high- 
dimensional Hilbert space. If we take this away, then the resulting model of computation 
can be simulated in polynomial time by a classical computer. To simulate the action of each 
gate, one would need only to multiply the state vector by a unitary matrix of polynomial 
dimension. Interestingly, classical optics can be described by a formalism that is nearly 
identical to quantum mechanics, the only difference being that the amplitudes are a func- 
tion of three spatial dimensions rather than an arbitrary number of degrees of freedom. As 
described in appendix [HI this analogy was fruitful in that it led me to discover a quan- 
tum algorithm for estimating gradients faster than is possible classically [107]. Intuitions 
from optics were apparently also used in the development of quantum algorithms for the 
identification of hidden nonlinear structures [39]. 

We have seen that an exponentially large state space is a necessary ingredient for the 
power of quantum computers. However, the states of a probabilistic computer live in a vector 
space of exponentially high dimension too. The state of a probabilistic computer with n 
bits is a vector in the $2^n$-dimensional space of probability distributions over its possible 
configurations. The essential difference is that quantum systems can exhibit interference 
due to the cancellation of amplitudes. In contrast probabilities are all positive and cannot 
interfere. 
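A two-outcome toy example makes the distinction visible. Below, a Hadamard gate acts on amplitudes and a fair coin flip acts on probabilities; both are 2 x 2 matrices, and both are applied twice to a system that starts in state 0. The variable names are illustrative.

```python
from math import sqrt

def apply(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(matrix[i][j] * vec[j] for j in range(len(vec)))
            for i in range(len(matrix))]

# Hadamard gate: acts on amplitudes, which may be negative
H = [[1 / sqrt(2), 1 / sqrt(2)],
     [1 / sqrt(2), -1 / sqrt(2)]]

# Fair coin flip: a stochastic matrix acting on nonnegative probabilities
F = [[0.5, 0.5],
     [0.5, 0.5]]

start = [1.0, 0.0]

quantum_amplitudes = apply(H, apply(H, start))
quantum_probs = [a * a for a in quantum_amplitudes]   # measurement probabilities
classical_probs = apply(F, apply(F, start))
```

After one step both systems give outcome 1 with probability 1/2. After two steps the two paths leading to outcome 1 carry amplitudes +1/2 and -1/2 and cancel, so the quantum system returns to state 0 with certainty, while the probabilistic system stays uniformly random.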

Figure 1-7: A diagram of (loose) conceptual relationships between quantum mechanics, 
classical optics, and probability. Correspondingly, by taking away interference from quantum 
computers, one is left with the power of probabilistic computers, and by taking away 
the exponentially high dimensional space of quantum states, one is left with the power of 
optical computing. Interestingly, the Fourier transform is an important primitive in both 
quantum and optical computing. 

In light of this comparison between quantum and probabilistic computers, it is natural 
to ask whether it is necessary that the amplitudes be complex for quantum computers to 
retain their power. After all, real amplitudes can still interfere as long as they are allowed 
to be both positive and negative. It turns out that real amplitudes are sufficient to obtain 
BQP [159]. The proof of this is based on the following simple idea. Take an arbitrary state 
of n qubits 

$$\sum_{x=0}^{2^n-1} a_x |x\rangle.$$ 

One can encode it by the following real state on n + 1 qubits 

$$\sum_{x=0}^{2^n-1} \left( \mathrm{Re}(a_x)\, |x\rangle|0\rangle + \mathrm{Im}(a_x)\, |x\rangle|1\rangle \right).$$ 

As shown in [159], for each quantum gate, an equivalent version on the encoded states can 
be efficiently constructed. As a result, arbitrary quantum computations can be simulated 
using only real amplitudes. 
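
The real-amplitude encoding can be checked numerically on a small example. The sketch below assumes the standard encoding in which the extra qubit stores the real part of each amplitude in its |0> component and the imaginary part in its |1> component, and in which each matrix entry u of a gate is replaced by the real 2x2 block [[Re u, -Im u], [Im u, Re u]]. The function names are hypothetical, and this is an illustration rather than the construction as written in [159].

```python
import math

def encode_state(amplitudes):
    """Map sum_x a_x|x> to the real vector sum_x Re(a_x)|x>|0> + Im(a_x)|x>|1>."""
    encoded = []
    for a in amplitudes:
        a = complex(a)
        encoded.extend([a.real, a.imag])  # the extra qubit is the last one
    return encoded

def encode_gate(unitary):
    """Real matrix acting on encoded states the way `unitary` acts on the originals."""
    dim = len(unitary)
    real_gate = [[0.0] * (2 * dim) for _ in range(2 * dim)]
    for r in range(dim):
        for c in range(dim):
            u = complex(unitary[r][c])
            # Multiplication by u = p + iq sends (Re a, Im a) to
            # (p Re a - q Im a, q Re a + p Im a).
            real_gate[2 * r][2 * c] = u.real
            real_gate[2 * r][2 * c + 1] = -u.imag
            real_gate[2 * r + 1][2 * c] = u.imag
            real_gate[2 * r + 1][2 * c + 1] = u.real
    return real_gate

def matvec(matrix, vector):
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

# Example: the phase gate diag(1, i) acting on (|0> + i|1>)/sqrt(2).
PHASE_GATE = [[1, 0], [0, 1j]]
state = [1 / math.sqrt(2), 1j / math.sqrt(2)]

direct = encode_state(matvec(PHASE_GATE, state))
via_real_gate = matvec(encode_gate(PHASE_GATE), encode_state(state))
print(direct)
print(via_real_gate)
```

Applying the complex gate and then encoding gives the same real vector as encoding first and then applying the real gate, which is the property the simulation argument relies on.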

It is often said that entanglement is a key to the power of quantum computers. A 
completely unentangled pure state on n qubits is always of the form 

$$|\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle$$ 

where $|\psi_1\rangle, \ldots, |\psi_n\rangle$ are each single-qubit states. Each of these states can be described by 
a pair of complex amplitudes. Thus, the unentangled states are described by 2n complex 
numbers, in contrast to arbitrary states, which in general require $2^n$. Hence, it is not surprising 
that quantum computers must use entangled states in order to obtain speedup over 
classical computation. 
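
The parameter counting above can be illustrated for small systems. As a sketch under the assumption that we restrict to two qubits, a pure state with amplitudes a00, a01, a10, a11 is a product state exactly when the 2x2 matrix of amplitudes has rank one, that is, when a00*a11 - a01*a10 vanishes. The helper names are hypothetical.

```python
import math

def tensor(q1, q2):
    """Amplitudes of the product state |q1> (x) |q2>, ordered 00, 01, 10, 11."""
    return [q1[0] * q2[0], q1[0] * q2[1], q1[1] * q2[0], q1[1] * q2[1]]

def is_product_state(a00, a01, a10, a11, tol=1e-12):
    """A two-qubit pure state is unentangled iff its amplitude matrix has zero determinant."""
    return abs(a00 * a11 - a01 * a10) < tol

def parameter_counts(n):
    """Complex numbers needed for an unentangled state versus a general state on n qubits."""
    return 2 * n, 2 ** n

h = 1 / math.sqrt(2)
product = tensor([0.6, 0.8j], [h, -h])
bell = [h, 0.0, 0.0, h]

print(is_product_state(*product))  # unentangled by construction
print(is_product_state(*bell))     # the Bell state is entangled
print(parameter_counts(10))
```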

Both interference and an exponentially high-dimensional state space seem to be necessary 
to the power of quantum computation. Nevertheless, there are classes of quantum 
processes which involve both of these characteristics yet can be simulated classically in 
polynomial time. Certain quantum states admit a concise group-theoretic description. The 
Pauli group $P_n$ on n qubits is the group of n-fold tensor products of the four Pauli matrices 
{X, Y, Z, I} with phases of ±1 and ±i. 

The states stabilized by subgroups of the Pauli group (that is, the simultaneous +1 eigenstates 
of every element of the subgroup) are called stabilizer states. Any 
stabilizer state on n qubits can be concisely described using poly(n) bits by listing a set 
of generators for its stabilizer subgroup. The Clifford group is the normalizer of the Pauli 
group. As discussed in [83], it is generated by CNOT, Hadamard, and the phase gate 
$S = \mathrm{diag}(1, i)$. 
Applying a Clifford group operation to a stabilizer state results in another stabilizer state. 
Thus quantum circuits made from gates in the Clifford group can be efficiently simulated 
on a classical computer [82], provided the initial state is a stabilizer state. This is possible 
even though stabilizer states can be highly entangled and can involve both positive and 
negative amplitudes. This result is known as the Gottesman-Knill theorem. 
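
The idea behind this classical simulation can be sketched in a few lines. The code below is a simplified illustration, not the algorithm as presented in the cited references: it stores each stabilizer generator as a pair of bit vectors (x, z) plus a sign bit, and updates them under Hadamard, phase, and CNOT using the standard binary update rules. The class and method names are hypothetical, and measurement is omitted.

```python
class StabilizerState:
    """Tracks n stabilizer generators of an n-qubit state using O(n^2) bits."""

    def __init__(self, n):
        self.n = n
        # |0...0> is stabilized by Z acting on each qubit separately.
        self.x = [[0] * n for _ in range(n)]
        self.z = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
        self.sign = [0] * n  # 0 means +1, 1 means -1

    def hadamard(self, a):
        # H exchanges X and Z, and sends Y to -Y.
        for g in range(self.n):
            self.sign[g] ^= self.x[g][a] & self.z[g][a]
            self.x[g][a], self.z[g][a] = self.z[g][a], self.x[g][a]

    def phase(self, a):
        # S sends X to Y and Y to -X, and leaves Z fixed.
        for g in range(self.n):
            self.sign[g] ^= self.x[g][a] & self.z[g][a]
            self.z[g][a] ^= self.x[g][a]

    def cnot(self, control, target):
        # CNOT copies X from control to target and Z from target to control.
        for g in range(self.n):
            xc, zc = self.x[g][control], self.z[g][control]
            xt, zt = self.x[g][target], self.z[g][target]
            self.sign[g] ^= xc & zt & (xt ^ zc ^ 1)
            self.x[g][target] ^= xc
            self.z[g][control] ^= zt

    def generators(self):
        labels = {(0, 0): "I", (1, 0): "X", (1, 1): "Y", (0, 1): "Z"}
        return [("-" if s else "+") + "".join(labels[(xb, zb)] for xb, zb in zip(xr, zr))
                for xr, zr, s in zip(self.x, self.z, self.sign)]

# Preparing a Bell pair: Hadamard on qubit 0, then CNOT from qubit 0 to qubit 1.
bell = StabilizerState(2)
bell.hadamard(0)
bell.cnot(0, 1)
print(bell.generators())
```

Each gate costs O(n) bit operations per generator, so a circuit of poly(n) Clifford gates is tracked in polynomial time even though the state itself, here a maximally entangled pair, has no efficient description as a product state.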

In addition, many quantum states with limited but nonzero entanglement can be con- 
cisely described using the matrix product state (MPS) and projected entangled pair state 
(PEPS) formalisms. Matrix product states can have amplitudes of all phases. Nevertheless, 
processes on MPS and PEPS with limited entanglement can be efficiently simulated on 
classical computers [HHl [157]. 



1.5 Fault Tolerance 

Quantum computation is not the first model of physical computation to offer an apparent 
exponential advantage over standard digital computers. Certain analog circuits and even 
mechanical devices have seemed to achieve exponential speedups for some problems. However, 
closer inspection has always shown that these devices depend on exponential precision 
to operate, so the speedups offered are physically unrealistic (see the introduction of [160] 
and references therein). This is one reason why we rarely see discussion of analog computers 
today. 

One of the most sensible objections raised in the early days of quantum computing was 
that quantum computers might be a form of analog computer, dependent on exponential 
precision in order to achieve exponential speedup. The discussion of section 1.2 shows that 
this is not true. To perform a computation of poly(n) gates, one needs only to perform 
each gate with 1/poly(n) precision. However, from a practical point of view, this seems not 
entirely satisfactory. Presumably the precision achievable in the laboratory is limited. Thus 
even if the precision necessary for computations of size n is only 1/poly(n), the achievable 
computations will be limited to some maximum size. In addition to gate imperfections, 
errors can arise from stray couplings to the environment, which are ignored in the analysis 
of section 1.2. 
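
The reason inverse polynomial precision suffices can be summarized in one line. As a sketch, write $U_1, \ldots, U_G$ for the ideal gates and $\tilde{U}_1, \ldots, \tilde{U}_G$ for their imperfect but still unitary implementations; these symbols are introduced here for illustration only. The standard telescoping argument then gives 

$$\left\| U_G \cdots U_1 - \tilde{U}_G \cdots \tilde{U}_1 \right\| \le \sum_{i=1}^{G} \left\| U_i - \tilde{U}_i \right\|.$$ 

Hence if each of G = poly(n) gates is implemented to within $\epsilon$ in operator norm, the total error is at most $G\epsilon$, which remains small provided $\epsilon$ is 1/poly(n). Errors add at most linearly rather than compounding exponentially. 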

The threshold theorem shows that both of these problems are solvable in principle. More 
precisely, the threshold theorem shows that if errors are below a certain fixed threshold, 
then quantum computations of unlimited length can be carried out reliably (see chapter 10 
of [137]). This is achieved by encoding the quantum information using quantum error correcting 
codes, and continually correcting errors as they occur throughout the computation. 
It does not matter whether the error arises from gate imperfections or from stray influence 
from the environment, as long as the total error rate is below the fault tolerance threshold. 

Any operator on n qubits can be uniquely decomposed as a linear combination of n- 
fold tensor products of the Pauli matrices {X,Y, Z, I}. The number of non-identity Pauli 
matrices in a given tensor product is called its weight. If the Pauli decomposition of an 
operator consists only of tensor products of weight at most k, then the operator is said to 
be k-local. 
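
These definitions translate directly into code once an operator is stored by its Pauli decomposition. The sketch below assumes a representation chosen here for illustration: a dictionary mapping Pauli strings such as "ZZII" to their coefficients. The example Hamiltonian is hypothetical.

```python
def weight(pauli_string):
    """Number of non-identity Pauli matrices in a tensor product such as 'XIZY'."""
    return sum(1 for p in pauli_string if p != "I")

def locality(operator):
    """Smallest k such that the operator is k-local.

    `operator` maps Pauli strings to coefficients; terms with zero
    coefficient are ignored.
    """
    return max((weight(p) for p, c in operator.items() if c != 0), default=0)

# Hypothetical example: a transverse-field Ising chain on four qubits.
ising = {
    "ZZII": -1.0, "IZZI": -1.0, "IIZZ": -1.0,
    "XIII": 0.5, "IXII": 0.5, "IIXI": 0.5, "IIIX": 0.5,
}

print(weight("XIYZ"))
print(locality(ising))
```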

The essence of quantum error correction is to take advantage of the fact that errors 
encountered are likely to be of low Pauli weight. Suppose for example, that each qubit gets 
flipped in the computational basis (i.e. acted on by an X operator) with probability p. 
Then, the probability that the resulting error is of weight k is of order $p^k$. If p is small, then 
with high probability the errors will be of low weight. The error model described here is an 
essentially classical one, but as discussed in [137], the same conclusion carries through when 
considering errors other than bit flips, coherent superpositions rather than probabilistic mixtures 
of corrupted and uncorrupted states, and errors arising from persistent perturbations 
to the control Hamiltonian rather than discrete "kicks". 
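
For independent bit flips the weight of the error follows a binomial distribution, which makes the suppression of high-weight errors easy to tabulate. The following is a small sketch; the values n = 7 and p = 0.01 are arbitrary choices for illustration.

```python
from math import comb

def weight_distribution(n, p):
    """Probability that independent bit flips with rate p produce an error of weight k,
    for k = 0, ..., n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

dist = weight_distribution(7, 0.01)
for k, prob in enumerate(dist[:4]):
    print(f"weight {k}: {prob:.3e}")
```

Each additional unit of weight costs roughly a further factor of p, so a code that corrects all errors up to some fixed weight fails only with a probability that is a correspondingly high power of p.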

An [n, k] quantum code is a $2^k$-dimensional subspace of the $2^n$-dimensional Hilbert space 
of n qubits. Thus, it encodes k logical qubits using n physical qubits. Let P be the projector 
onto the code. Suppose that there is a discrete set of possible errors $\{E_1, E_2, \ldots, E_m\}$ 
that we wish to correct. Then for error correction it suffices for $E_1 P, E_2 P, \ldots, E_m P$ to be 
mutually orthogonal, because then the errors can be distinguished and hence corrected. Of 
course, in the quantum setting, the possible errors may form a continuum, but as discussed 
in [137], this problem can be avoided by making an appropriate measurement to collapse the 
system into one of a discrete set of errors. 
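
The orthogonality condition can be seen in the simplest example, which is not discussed in the text above and is used here only as an illustration: the three-qubit repetition code spanned by |000> and |111>, with the error set {I, X1, X2, X3}. Each error maps the code to a subspace spanned by a different pair of computational basis states, so the images are mutually orthogonal and a parity measurement identifies the error. The sketch tracks basis states as bit strings.

```python
CODEWORDS = ["000", "111"]
ERRORS = {"I": None, "X1": 0, "X2": 1, "X3": 2}  # position of the flipped bit, if any

def flip(bits, position):
    """Apply a bit flip at `position` (or do nothing if position is None)."""
    if position is None:
        return bits
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]

def image(error):
    """Basis states spanning E P, the image of the code under the given error."""
    return {flip(word, ERRORS[error]) for word in CODEWORDS}

def syndrome(bits):
    """Parities of neighboring bits; identical for both basis states of each image."""
    return (int(bits[0]) ^ int(bits[1]), int(bits[1]) ^ int(bits[2]))

# Each error has its own syndrome, so measuring the syndrome reveals the error.
SYNDROME_TO_ERROR = {syndrome(flip("000", pos)): name for name, pos in ERRORS.items()}

def correct(bits):
    """Undo the error indicated by the syndrome."""
    return flip(bits, ERRORS[SYNDROME_TO_ERROR[syndrome(bits)]])

for name in ERRORS:
    print(name, sorted(image(name)))
print(correct("011"), correct("010"))
```

Because the syndrome does not depend on which codeword was sent, measuring it does not disturb a superposition of |000> and |111>, which is what allows the same procedure to work on encoded quantum states.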

As an alternative to the active correction of errors, schemes have been proposed in 
which the physical system from which the quantum computer is constructed has intrinsic 
resistance to errors. One of the earliest examples of this is Kitaev's quantum memory 
based on toric codes [HTj. The toric code is an $[l^2, 2]$ code defined on an $l \times l$ square lattice 
of qubits on a torus. It has the property that any error of weight less than l is correctable. 
Furthermore, one can construct a 4-local Hamiltonian with a 4-fold degenerate ground space 
equal to the code space. The Hamiltonian provides an energy penalty against any error of 
weight l or less. Thus, if the ambient temperature is small compared to this energy penalty, 
the system is unlikely to get kicked out of the ground space. Furthermore, c-local error 
terms in the Hamiltonian only cause splitting of the ground space degeneracy at order l/c 
in perturbation theory. Topological quantum computation is closely related to toric codes 
and is also a promising candidate for intrinsically robust quantum computation [136]. 

The active schemes of quantum error correction generally yield very low fault tolerance 
thresholds which are difficult to achieve experimentally. Furthermore, the amount of overhead 
incurred by the error correction process can be very large if the noise is only slightly 
below the threshold. The passive schemes of error protection may reduce or eliminate the 
need for costly active error correction. In chapter 2 I investigate the fault tolerance of 
adiabatic quantum computers and find that such passive error protection schemes show 
promise for the adiabatic model of quantum computation. Although adiabatic quantum 
computation has attractive features for experimentalists, particularly regarding solid state 
qubits, no threshold theorem for the adiabatic model of quantum computation is currently 
known. I see the establishment of a threshold theorem for adiabatic quantum computers as 
a major open problem in the theory of quantum fault tolerance. 



1.6 Models of Quantum Computation 

In section 1.2 I discussed the universality of quantum circuits, and the reasons to believe 
that no model of quantum computation is more powerful than the quantum circuit model. 
That is, no discrete nonrelativistic quantum system is capable of efficiently solving problems 
outside of BQP. Furthermore, as discussed in section 1.5, quantum circuits can be 
fault tolerant in the sense that they can accurately perform arbitrarily long computations 
provided the error rate is below a certain threshold. Thus, in principle, the quantum circuit 
model is the only model we need for both the study of quantum algorithms and the physical 
implementation of quantum computers. In practice however, for both the development of 
new quantum algorithms and the physical construction of quantum computers it has proven 
useful to have alternative models of quantum computation. 



1.6.1 Adiabatic 

In the adiabatic model of quantum computation, one starts with an initial Hamiltonian 
with an easy to prepare ground state, such as $|0\rangle^{\otimes n}$. Then, the Hamiltonian is slowly varied 
until it reaches some final Hamiltonian whose ground state encodes the solution to some 
computational problem. The adiabatic theorem shows that if the Hamiltonian is varied 
sufficiently slowly and the energy gap between the ground state and first excited state is 
sufficiently large, then the system will track the instantaneous ground state of the time-varying 
Hamiltonian. More precisely, suppose the Hamiltonian is H(t), and the evolution is 
from t = 0 to t = T. Let $\gamma(t)$ be the gap between the ground energy and first excited energy 
at time t. Let $\gamma = \min_{0 \le t \le T} \gamma(t)$. Then the necessary runtime to ensure high overlap of the 
final state with the final ground state scales as $1/\mathrm{poly}(\gamma)$. A rough analysis [128] suggests 
that the runtime should in fact scale as $1/\gamma^2$. However, it is not clear that this holds as a 
rigorous theorem for all cases. Nevertheless, rigorous versions of the adiabatic theorem are 
known. For example, in appendix F we reproduce an elegant proof due to Jeffrey Goldstone 
of the following theorem: 

Theorem 2. Let H(s) be a finite-dimensional twice differentiable Hamiltonian on $0 \le s \le 1$ 
with a nondegenerate ground state $|\phi_0(s)\rangle$ separated by an energy gap $\gamma(s)$. Let $|\psi(t)\rangle$ 
be the state obtained by Schrödinger time evolution with Hamiltonian H(t/T) starting with 
state $|\phi_0(0)\rangle$ at t = 0. Then, with appropriate choice of phase for $|\phi_0(T)\rangle$, 

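$$\left\| |\psi(T)\rangle - |\phi_0(T)\rangle \right\| \le \frac{1}{T} \left[ \frac{1}{\gamma(0)^2} \left\| \frac{dH}{ds} \right\|_{s=0} + \frac{1}{\gamma(1)^2} \left\| \frac{dH}{ds} \right\|_{s=1} + \int_0^1 ds \left( \frac{5}{\gamma^3} \left\| \frac{dH}{ds} \right\|^2 + \frac{1}{\gamma^2} \left\| \frac{d^2 H}{ds^2} \right\| \right) \right]$$ 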
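
The qualitative content of the adiabatic theorem can be illustrated numerically on a single qubit. The sketch below uses a toy interpolation chosen here for illustration, H(s) = -(1 - s) X - s Z, whose ground state is |+> at s = 0 and |0> at s = 1. It integrates the Schrödinger equation with piecewise constant steps and reports the probability of ending in the final ground state. The function names and the number of steps are assumptions of this example.

```python
import math

def step(state, hx, hz, dt):
    """Apply exp(-i (hx X + hz Z) dt) to a single-qubit state [a, b]."""
    norm = math.hypot(hx, hz)
    c, s = math.cos(norm * dt), math.sin(norm * dt)
    nx, nz = hx / norm, hz / norm
    a, b = state
    # exp(-i theta n.sigma) = cos(theta) I - i sin(theta) (nx X + nz Z)
    return [c * a - 1j * s * (nz * a + nx * b),
            c * b - 1j * s * (nx * a - nz * b)]

def final_ground_state_probability(total_time, steps=4000):
    """Evolve under H(t/T) = -(1 - t/T) X - (t/T) Z, starting from the ground state of -X."""
    dt = total_time / steps
    state = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # |+>, the ground state of H(0)
    for j in range(steps):
        s = (j + 0.5) / steps  # midpoint of the current time step
        state = step(state, -(1 - s), -s, dt)
    return abs(state[0]) ** 2  # overlap with |0>, the ground state of H(1)

for total_time in (0.1, 1.0, 10.0, 50.0):
    print(total_time, final_ground_state_probability(total_time))
```

For very short runtimes the state barely moves and the success probability stays near 1/2, while for runtimes long compared with the inverse gap it approaches 1, in line with the 1/T dependence of the bound in Theorem 2.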

Schrödinger's equation shows that, for any constant g, the time-dependent Hamiltonian 
gH(gt) yields the same time evolution from time 0 to T/g that H(t) yields from 0 to T. 
Thus, the running time of an adiabatic algorithm would not appear to be well defined. 
However, in any experimental realization there will be a limit to the magnitude of the fields 
and couplings. Thus it is reasonable to limit the norm of each local term in H(t). Such 
a restriction enables one to make statements about how the running time of an adiabatic 
algorithm scales with some measure of the problem size. An alternative convention is to 
simply normalize $\|H(t)\|$ to 1. 

Adiabatic quantum computation was first proposed as a method to solve combinatorial 
optimization problems [69]. The spectral gap, and hence the runtime, of the proposed 
adiabatic algorithms for combinatorial optimization remain unknown. Quantum circuits 
can simulate adiabatic quantum computers with polynomial overhead using standard techniques 
of quantum simulation. In [8] it was shown that adiabatic quantum computers can 
simulate arbitrary quantum circuits with polynomial overhead. Thus, up to a polynomial 
factor, adiabatic quantum computers are equivalent to the quantum circuit model. In other 
words, the set of problems solvable in polynomial time by adiabatic quantum computers is 
exactly BQP. 

In [8], Aharonov et al. present a construction for doing universal quantum computation 
with a 5-local Hamiltonian. The minimum eigenvalue gap is proportional to $1/g^2$, where g 
is the number of gates in the circuit being simulated. Assuming quadratic scaling of runtime 
with the inverse gap, this implies a quartic overhead. They also show how to achieve universal 
adiabatic quantum computation with 3-local Hamiltonians and a runtime of 0{l/g^'^). 
Using the perturbative gadgets of [113] this can be reduced to 2-local with further overhead 
in runtime. These runtimes were subsequently greatly improved. Using a clever construction 
of Nagaj and Mozes [134], one can achieve universal adiabatic quantum computation 
using a 3-local Hamiltonian with a gap of order $1/g^2$ throughout the computation.¹ 

When using adiabatic quantum computation as a method of devising algorithms rather 
than as an architecture for building quantum computers, one can consider simulatable 
Hamiltonians, which are a larger class than physically realistic Hamiltonians. As shown 
in [7, 25], sparse Hamiltonians can be efficiently simulated on quantum circuits even if they 
are not local. Furthermore, if the adiabatic algorithm runs in time T, then the simulation 
can be accomplished in time $T^{1+1/k}$ using a $k^{\mathrm{th}}$ order Suzuki-Trotter formula. 

Several reasons have been proposed for why adiabatic quantum computers might be 
easier to physically implement than standard quantum computers. The standard architecture 
for physically implementing quantum computation is based on the quantum circuit 
model. Each gate is performed by applying a pulse to the relevant qubits. For example, 
in an ion trap quantum computer, laser pulses are used to manipulate the electronic 
state of ions. In any such pulse-based scheme, it takes a large bandwidth to transmit the 
control pulses to the qubits. This therefore leaves a large window open for noise to enter 
the system and disturb the qubits. In contrast, in an adiabatic quantum computer, all the 
control is essentially DC, and therefore much of the noise other than that at extremely low 
frequencies can be filtered out [59, 152]. Secondly, as a consequence of the adiabatic theorem, 
if the Hamiltonian H(t) drifts off course during the computation, then the adiabatic 
algorithm will still succeed provided that the initial and final Hamiltonians are correct, 
and adiabaticity is maintained. Furthermore, dephasing in the eigenbasis of H(t) causes 
no decrease in the success probability. Lastly, H(t) is applied constantly. If the minimum 
energy gap between the ground and first excited states is $\gamma$ and the ambient temperature 
kT is less than $\gamma$, then the system will be unlikely to get thermally excited out of its ground 
state [41]. Unfortunately, in most adiabatic algorithms, $\gamma$ scales inversely with the problem 
size, apparently necessitating progressively lower temperatures to solve larger problems. A 
technique for getting around this problem is discussed in chapter 2. 



¹Surprisingly, the authors do not explicitly state in [134] that their construction can be used for this 
purpose. 

1.6.2 Topological 



Topological quantum computation is a model of quantum computation based on the braiding 
of a certain type of quasiparticles called anyons, which can arise in quasi-two-dimensional 
many-body systems. The energy of the system depends only on how many quasiparticles 
are present. By adiabatically dragging these particles around one another in two dimensions 
and back to their original locations, one may incur a Berry's phase. In certain systems, this 
phase has the special property that it depends only on the topology of the path and not 
on the geometry. The phase induced by the braiding of n identical particles will thus be a 
representation of the n-strand braid group. Particles with this property are called anyons. 
If the space of n-particle states is d-fold degenerate, then by braiding the particles around 
each other, one can move the system within this degenerate space. The "phase" in this case 
is a d-dimensional unitary representation of the n-strand braid group. The representation 
can thus be non-Abelian, in which case the particles are said to be non-Abelian anyons. 

Not all representations of the braid group correspond to anyons that are physically 
realized. This is because in addition to winding around each other, anyons can be fused. 
For example, consider the Abelian representation of the braid group where the clockwise 
swapping of a pair of particles induces a phase of $e^{i\theta}$. Now, we may fuse a pair of anyons 
into a bound pair, which can be thought of as another species of anyon. Winding two of 
these clockwise around each other must induce a phase of $e^{i4\theta}$, because each anyon in each 
pair has wound around each anyon in the other pair. The non-Abelian case is analogous but 
more complicated. Such fusion rules and the condition that the theory be purely topological 
create constraints on which representations of the braid group can arise from braiding of 
anyons. A set of braiding rules and fusion rules satisfying the consistency constraints is 
called a topological quantum field theory (TQFT). Topological quantum field theories can 
also be formulated in the more traditional language of Lagrangians and path integrals. 
However, in this thesis I will not need to use the Lagrangian formulation of TQFTs. 

Topological quantum field theories have been well studied by both mathematicians and 
physicists. One remarkable result is that the complete set of consistency relations between 
braiding and fusion is captured by just two identities, known as the pentagon 
and hexagon identities [136]. Despite this progress, a full classification of topological quantum 
field theories is not known. However, several interesting and nontrivial examples of 
topological quantum field theories are known. Of particular interest for quantum computing is the 
TQFT whose particles are called Fibonacci anyons. A set of n Fibonacci anyons lives in a 
degenerate eigenspace whose dimension is $f_{n+1}$, the $(n+1)^{\mathrm{th}}$ Fibonacci number. Freedman 
et al. showed [74] that the representation of the braid group induced by the braiding of 
Fibonacci anyons is dense in $SU(f_{n+1})$, and furthermore that quantum circuits on n qubits 
with poly(n) gates can be efficiently simulated by a braid on poly(n) Fibonacci anyons with 
poly(n) crossings. This is made possible by the fact that $f_{n+1}$ is exponential in n, and 
that the Fibonacci representation has some local structure onto which the tensor product 
structure of quantum circuits can be efficiently mapped. (More detail is given in chapter 3.) 
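
The exponential growth of the Fibonacci numbers is easy to quantify. The following sketch assumes the indexing convention f_1 = f_2 = 1, which may differ from the convention used elsewhere in the thesis by a shift of the index. It computes how many qubits' worth of Hilbert space dimension each anyon contributes.

```python
import math

def fibonacci(m):
    """Return f_m under the assumed convention f_1 = f_2 = 1."""
    a, b = 1, 1
    for _ in range(m - 1):
        a, b = b, a + b
    return a

def qubits_per_anyon(n):
    """log2 of the dimension f_{n+1} of the n-anyon space, divided by n."""
    return math.log2(fibonacci(n + 1)) / n

GOLDEN_RATIO = (1 + math.sqrt(5)) / 2

for n in (10, 100, 1000):
    print(n, fibonacci(n + 1), round(qubits_per_anyon(n), 4))
print("limit:", round(math.log2(GOLDEN_RATIO), 4))
```

The dimension grows like a power of the golden ratio, so each additional anyon contributes about 0.69 qubits in the limit of many anyons, and poly(n) anyons suffice to hold the state of n qubits.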

The upshot of this correspondence between braids and quantum circuits is that one in 
principle can solve any problem in BQP by dragging Fibonacci anyons around each other. 
Conversely, it has also been shown that quantum circuits can simulate topological quantum 
field theories [73]. Thus topological quantum computing with Fibonacci anyons is equivalent 
to BQP. If non-Abelian anyons can be detected and manipulated, they may provide a useful 
medium for quantum computation. Topological quantum computations are believed to have 
a high degree of inherent fault tolerance. As long as the anyons are kept well separated, small 
deviations in their trajectories will not change the topology of their braiding, and hence 
will not change the encoded quantum circuit. Furthermore, the degenerate eigenspace in 
which the anyons live is protected by an energy gap. Thus thermal transitions out of the 
space are unlikely. Thermal transitions between states within the degenerate space, while 
not protected against by an energy gap, are also unlikely because they can only be induced 
by nonlocal, topological operations. 

The topological model of quantum computation has also been useful in the development 
of new quantum algorithms. In 1989, Witten showed that the Jones polynomial (a powerful 
and important knot invariant) arises as a Wilson loop in a particular quantum field theory 
called Chern-Simons theory [173]. The subsequent discovery by Freedman et al. [73] that 
quantum computers can simulate topological quantum field theories thus implicitly showed 
that quantum computers can efficiently approximate Jones polynomials. Furthermore, the 
discovery by Freedman et al. that topological quantum field theories can simulate quantum 
circuits implicitly showed that a certain problem of estimating Jones polynomials at the 
fifth root of unity is BQP-hard. As discussed in chapter 3, this has since led to a whole new 
class of exponential speedups by quantum computation for the approximation of various 
knot invariants and other polynomials. Furthermore, these speedups are very different from 
previously known exponential quantum speedups, most of which are in some way based on 
the hidden subgroup problem. 

1.6.3 Quantum Walks 

In a continuous time quantum walk, one chooses a graph with nodes that correspond to 
orthogonal states in a Hilbert space. The Hamiltonian is then chosen to be either the 
adjacency matrix or Laplacian of this graph. (For regular graphs these are equivalent up to 
an overall energy shift). The quantum walk is the unitary time evolution induced by this 
Hamiltonian. 
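
A continuous time quantum walk on a small graph can be simulated directly. The sketch below takes the Hamiltonian to be the adjacency matrix A and computes exp(-iAt) applied to a starting node by a truncated Taylor series, which is adequate for the small graph and short time used here but is not a general-purpose method. The graph (a 4-cycle) and the function name are choices made for this illustration.

```python
import math

def quantum_walk(adjacency, start, time, terms=60):
    """Return the amplitudes of exp(-i A t)|start> using a truncated Taylor series."""
    n = len(adjacency)
    term = [0j] * n
    term[start] = 1 + 0j
    amplitudes = list(term)
    for k in range(1, terms):
        # The k-th Taylor term is (-i t A / k) applied to the (k-1)-th term.
        term = [(-1j * time / k) * sum(adjacency[r][c] * term[c] for c in range(n))
                for r in range(n)]
        amplitudes = [a + d for a, d in zip(amplitudes, term)]
    return amplitudes

# Adjacency matrix of a cycle on four nodes: 0 - 1 - 2 - 3 - 0.
CYCLE4 = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

probs = [abs(a) ** 2 for a in quantum_walk(CYCLE4, start=0, time=math.pi / 2)]
print([round(p, 6) for p in probs])
```

At time pi/2 the walker started at node 0 is found at the opposite node 2 with certainty. A classical random walk on the same graph spreads out and never concentrates on a single node in this way, which again reflects interference between paths.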

Continuous time quantum walks were introduced in [70]. They have been found to provide 
an exponential speedup over classical computation for at least one oracular problem [44]. 
Discrete time quantum walks have also been formulated, and appear to be comparable in 
power to continuous time quantum walks. Quantum walks have now been used to find 
polynomial speedups for several natural oracular problems, as discussed in section 1.3. No 
result exists in the literature answering the question as to whether quantum walks are 
BQP-complete (i.e. universal). However, recent progress suggests that, to my surprise, 
quantum walks may in fact be BQP-complete [40]. 

Quantum walks are probably not useful in devising physical models of quantum com- 
putation. The most obvious approach is to lay out the nodes in space and couple them 
together along the edges. However, in a quantum walk, the nodes form the basis of the 
Hilbert space, which usually has exponentially high dimension. In quantum walk algorithms, 
the Hamiltonian is usually not k-local for any fixed k. Nevertheless, as discussed 
in |43t [3 |45] , these Hamiltonians are efficiently simulable by quantum circuits since they 
are sparse and efficiently row-computable. 

1.6.4 One Clean Qubit 

In the one clean qubit model of quantum computation, one is given a single qubit in a pure 
state, and n qubits in the maximally mixed state. One then applies a polynomial size quantum 
circuit to this initial state and afterwards performs a single-qubit measurement. The 
one clean qubit model was originally proposed as an idealization of quantum computation 
on highly mixed states, such as appear in NMR implementations [118, 14, 150]. 

It is not surprising that one clean qubit computers appear to be weaker than standard 
quantum computers. The amazing fact is that they can nevertheless solve certain problems 
for which no efficient classical algorithm is known. These problems include estimating 
the Pauli decomposition of the unitary matrix corresponding to a polynomial-size quantum 
circuit² [118, 158], estimating quadratically signed weight enumerators [119], and estimating 
average fidelity decay of quantum maps p^Hll53j, and, as shown in chapter 3, approximating 
certain Jones polynomials. 
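
The simplest of these tasks, estimating the normalized trace of a unitary, shows how a single clean qubit can extract information from a maximally mixed register. The sketch below assumes the usual circuit: a Hadamard on the clean qubit followed by the unitary applied to the register controlled on the clean qubit. It computes the exact expectation values of X and Y on the clean qubit by averaging over the basis states that make up the maximally mixed state. The function name is hypothetical, and in an actual one clean qubit computation these expectations would be estimated from repeated measurements rather than computed exactly.

```python
import math

def clean_qubit_expectations(unitary):
    """Exact <X> and <Y> of the clean qubit after H and controlled-U on a mixed register."""
    dim = len(unitary)
    x_total, y_total = 0.0, 0.0
    for j in range(dim):
        # The maximally mixed state is a uniform mixture of basis states |j>.
        # After the Hadamard and controlled-U, the joint state for a given j is
        # (|0>|j> + |1> U|j>) / sqrt(2).
        branch0 = [complex(1.0 if r == j else 0.0) / math.sqrt(2) for r in range(dim)]
        branch1 = [complex(unitary[r][j]) / math.sqrt(2) for r in range(dim)]
        overlap = sum(a.conjugate() * b for a, b in zip(branch0, branch1))
        x_total += 2 * overlap.real  # <X> for this basis state equals Re <j|U|j>
        y_total += 2 * overlap.imag  # <Y> for this basis state equals Im <j|U|j>
    return x_total / dim, y_total / dim

# Example: a diagonal unitary on two register qubits with trace 2 + 2i.
U = [
    [1, 0, 0, 0],
    [0, 1j, 0, 0],
    [0, 0, 1j, 0],
    [0, 0, 0, 1],
]
x_expectation, y_expectation = clean_qubit_expectations(U)
print(x_expectation, y_expectation)
```

With this sign convention the two expectation values are the real and imaginary parts of Tr(U) divided by the register dimension, so polynomially many repetitions estimate the normalized trace to inverse polynomial accuracy.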

The one clean qubit complexity class consists of the decision problems which can be 
solved in polynomial time by a one clean qubit machine with correctness probability of at 
least 2/3 by running a one clean qubit computer polynomially many times. In the original 
definition [118] of DQC1 it is assumed that a classical computer generates the quantum 
circuits to be applied to the initial state ρ. By this definition DQC1 automatically contains 
P. However, it is also interesting to consider a slightly weaker one clean qubit model, in 
which the classical computer controlling the quantum circuits has only the power of NC1. 
The resulting complexity class appears to have the interesting property that it is not 
contained in P nor does P contain it. One clean qubit computers and DQC1 are discussed 
in more detail in chapter 3. 



1.6.5 Measurement-based 

Amazingly, algorithm-dependent unitary operations are not necessary for universal quantum 
computation. Building on previous work [84, 138, 121], Raussendorf and Briegel showed in 
[145] that one can perform universal quantum computation by performing a series of single-qubit 
projective measurements on a special entangled initial state. The initial state need 
not depend on the computation to be performed, other than its total size. The basis of a 
given single-qubit measurement depends on the quantum circuit to be simulated, and on 
the outcomes of the preceding measurements. This dependence is efficiently computable 
classically. 

The measurement-based model is a promising candidate for the physical implementation 
of quantum computers. The measurement-based model has a fault tolerance threshold, 
which can be shown in a simple way by adapting the existing threshold theorem for the 
circuit model. It seems unlikely that the measurement-based model will be useful for the 
design of algorithms, because of its very direct relationship to the circuit model. However, 
the class of initial states used in the measurement model, called graph states, has many 
interesting properties both physical and information theoretic. For example, they form the 
basis (literally as well as figuratively) of a broad class of "nonadditive" quantum codes, which 
go beyond the stabilizer formalism [178, 52]. (Graph states are stabilizer states. However, 
quantum error correcting codes can be obtained as the span of a set of graph states. In 
general such a span is not equal to the subspace stabilized by any subgroup of the Pauli 
group.) 



²This includes estimating the trace of the unitary as a special case. 

1.6.6 Quantum Turing Machines 

Quantum Turing machines were first formulated in [57] and further studied in [23]. Quantum 
Turing machines are defined analogously to classical Turing machines except with a tape 
of qubits instead of a tape of bits, and with transition amplitudes instead of deterministic 
transition rules. Quantum Turing machines are usually somewhat cumbersome to work 
with, and have been replaced by quantum circuits for most applications. However, quantum 
Turing machines remain the only known model by which to define quantum Kolmogorov 
complexity. 

1.7 Outline of New Results 

In this thesis I present three main results relating to different models of quantum compu- 
tation. 

Recently, there has been growing interest in using adiabatic quantum computation as 
an architecture for experimentally realizable quantum computers. One of the reasons for 
this is the idea that the energy gap should provide some inherent resistance to noise. It is 
now known that universal quantum computation can be achieved adiabatically using 2-local 
Hamiltonians. The energy gap in these Hamiltonians scales as an inverse polynomial in the 
problem size. In chapter [2] I present stabilizer codes that can be used to produce a constant 
energy gap against 1-local and 2-local noise. The corresponding fault-tolerant universal 
Hamiltonians are 4-local and 6-local respectively, which is the optimal result achievable 
within this framework. I did this work in collaboration with Edward Farhi and Peter Shor. 

It is known that evaluating a certain approximation to the Jones polynomial for the plat 
closure of a braid is a BQP-complete problem. In chapter [3] I show that evaluating a certain 
additive approximation to the Jones polynomial at a fifth root of unity for the trace closure 
of a braid is a complete problem for the one clean qubit complexity class DQC1. That is, 
a one clean qubit computer can approximate these Jones polynomials in time polynomial 
in both the number of strands and number of crossings, and the problem of simulating a 
one clean qubit computer is reducible to approximating the Jones polynomial of the trace 
closure of a braid. I did this work in collaboration with Peter Shor. 

Adiabatic quantum algorithms are often most easily formulated using many-body interactions. 
However, experimentally available interactions are generally two-body. In 2004, 
Kempe, Kitaev, and Regev introduced perturbative gadgets, by which arbitrary three-body 
effective interactions can be obtained using Hamiltonians consisting only of two-body 
interactions [113]. These three-body effective interactions arise from the third order in perturbation 
theory. Since their introduction, perturbative gadgets have become a standard 
tool in the theory of quantum computation. In chapter 4 I construct generalized gadgets so 
that one can directly obtain arbitrary k-body effective interactions from two-body Hamiltonians 
using $k^{\mathrm{th}}$ order in perturbation theory. I did this work in collaboration with Edward 
Farhi. 

Chapter 2 



Fault Tolerance of Adiabatic 
Quantum Computers 

2.1 Introduction 

Recently, there has been growing interest in using adiabatic quantum computation as an 
architecture for experimentally realizable quantum computers. Aharonov et al. [8], building 
on ideas by Feynman [71] and Kitaev [114], showed that any quantum circuit can be simulated 
by an adiabatic quantum algorithm. The energy gap for this algorithm scales as an inverse 
polynomial in G, the number of gates in the original quantum circuit. G is identified as 
the running time of the original circuit. By the adiabatic theorem, the running time of 
the adiabatic simulation is polynomial in G. Because the slowdown is only polynomial, 
adiabatic quantum computation is a form of universal quantum computation. 

Most experimentally realizable Hamiltonians involve only few-body interactions. Thus 
theoretical models of quantum computation are usually restricted to involve interactions 
between at most some constant number of qubits k. Any Hamiltonian on n qubits can be 
expressed as a linear combination of terms, each of which is a tensor product of n Pauli 
matrices, where we include the 2 × 2 identity as a fourth Pauli matrix. If each of these tensor 
products contains at most k Pauli matrices not equal to the identity then the Hamiltonian 
is said to be k-local. The Hamiltonian used in the universality construction of [8] is 3-local 
throughout the time evolution. Kempe et al. subsequently improved this to 2-local in [113]. 

Schrödinger's equation shows that, for any constant g, gH(gt) yields the same time 
evolution from time 0 to T/g that H(t) yields from 0 to T. Thus, the running time of an 
adiabatic algorithm would not appear to be well defined. However, in any experimental 
realization there will be a limit to the magnitude of the fields and couplings. Thus it is 
reasonable to limit the norm of each term in H(t). Such a restriction enables one to make 
statements about how the running time of an adiabatic algorithm scales with some measure 
of the problem size, such as G. 

One of the reasons for interest in adiabatic quantum computation as an architecture 
is the idea that adiabatic quantum computers may have some inherent fault tolerance 
|41l 11551 El 1151^ I108j . Because the final state depends only on the final Hamiltonian, 
adiabatic quantum computation may be resistant to slowly varying control errors, which 
cause $H(t)$ to vary from its intended path, as long as the final Hamiltonian is correct. An
exception to this would occur if the modified path has an energy gap small enough to violate 
the adiabatic condition. Unfortunately, it is generally quite difficult to evaluate the energy 
gap of arbitrary local Hamiltonians.

Another reason to expect that adiabatic quantum computations may be inherently fault 
tolerant is that the energy gap should provide some inherent resistance to noise caused by 
stray couplings to the environment. Intuitively, the system will be unlikely to get excited 
out of its ground state if $k_B T$ is less than the energy gap. Unfortunately, in most proposed
applications of adiabatic quantum computation, the energy gap scales as an inverse poly- 
nomial in the problem size. Such a gap only affords protection if the temperature scales the 
same way. However, a temperature which shrinks polynomially with the problem size may 
be hard to achieve experimentally. 

To address this problem, we propose taking advantage of the possibility that the deco- 
herence will act independently on the qubits. The rate of decoherence should thus depend 
on the energy gap against local noise. We construct a class of stabilizer codes such that 
encoded Hamiltonians are guaranteed to have a constant energy gap against single-qubit 
excitations. These stabilizer codes are designed so that adiabatic quantum computation 
with 4-local Hamiltonians is universal for the encoded states. We illustrate the usefulness 
of these codes for reducing decoherence using a noise model, proposed in [41], in which each
qubit independently couples to a photon bath. 



2.2 Error Detecting Code 

To protect against decoherence we wish to create an energy gap against single-qubit distur- 
bances. To do this we use a quantum error correcting code such that applying a single Pauli 
operator to any qubit in a codeword will send this state outside of the codespace. Then we 
add an extra term to the Hamiltonian which gives an energy penalty to all states outside 
the codespace. Since we are only interested in creating an energy penalty for states outside 
the codespace, only the fact that an error has occurred needs to be detectable. Since we are 
not actively correcting errors, it is not necessary for distinct errors to be distinguishable. In 
this sense, our code is not truly an error correcting code but rather an error detecting code. 
Such passive error correction is similar in spirit to ideas suggested for the circuit model in 

m- 

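It is straightforward to verify that the 4-qubit code

\[ |0_L\rangle = \tfrac{1}{2} \left( |0000\rangle + i|0011\rangle + i|1100\rangle + |1111\rangle \right) \tag{2.1} \]

\[ |1_L\rangle = \tfrac{1}{2} \left( -|0101\rangle + i|0110\rangle + i|1001\rangle - |1010\rangle \right) \tag{2.2} \]

satisfies the error-detection requirements, namely

\[ \langle 0_L|\sigma|0_L\rangle = \langle 1_L|\sigma|1_L\rangle = \langle 0_L|\sigma|1_L\rangle = 0 \tag{2.3} \]

where $\sigma$ is any of the three Pauli operators acting on one qubit. Furthermore, the following
2-local operations act as encoded Pauli X, Y, and Z operators.

\[ X_L = Y \otimes I \otimes Y \otimes I \]
\[ Y_L = -I \otimes X \otimes X \otimes I \tag{2.4} \]
\[ Z_L = Z \otimes Z \otimes I \otimes I \]

That is,

\[ X_L|0_L\rangle = |1_L\rangle, \qquad X_L|1_L\rangle = |0_L\rangle, \]
\[ Y_L|0_L\rangle = i|1_L\rangle, \qquad Y_L|1_L\rangle = -i|0_L\rangle, \]
\[ Z_L|0_L\rangle = |0_L\rangle, \qquad Z_L|1_L\rangle = -|1_L\rangle. \]

An arbitrary state of a single qubit $\alpha|0\rangle + \beta|1\rangle$ is encoded as $\alpha|0_L\rangle + \beta|1_L\rangle$.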
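The claims in equations 2.1 through 2.4 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names (`pauli_string`, `ket`, `detects_all_single_qubit_errors`) are illustrative choices and not part of the original construction. The leftmost tensor factor is taken to be qubit 1.

```python
import itertools
import numpy as np

# Single-qubit Pauli matrices, with the 2x2 identity as a fourth "Pauli".
PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(label):
    """Tensor product of Paulis, e.g. 'YIYI' -> Y (x) I (x) Y (x) I."""
    op = np.eye(1, dtype=complex)
    for ch in label:
        op = np.kron(op, PAULI[ch])
    return op

def ket(bits):
    """Computational basis vector |bits>, leftmost bit = qubit 1."""
    v = np.zeros(2 ** len(bits), dtype=complex)
    v[int(bits, 2)] = 1.0
    return v

# Codewords of equations 2.1 and 2.2.
zero_L = 0.5 * (ket("0000") + 1j * ket("0011") + 1j * ket("1100") + ket("1111"))
one_L = 0.5 * (-ket("0101") + 1j * ket("0110") + 1j * ket("1001") - ket("1010"))

# Encoded Pauli operators of equation 2.4.
X_L = pauli_string("YIYI")
Y_L = -pauli_string("IXXI")
Z_L = pauli_string("ZZII")

def detects_all_single_qubit_errors():
    """Check equation 2.3 for every X, Y, Z on every one of the 4 qubits."""
    for qubit, p in itertools.product(range(4), "XYZ"):
        err = pauli_string("I" * qubit + p + "I" * (3 - qubit))
        for bra, k in itertools.product((zero_L, one_L), repeat=2):
            if abs(np.vdot(bra, err @ k)) > 1e-12:
                return False
    return True
```

Calling `detects_all_single_qubit_errors()` covers all four matrix elements between codewords, which is slightly more than the three written in equation 2.3 (the fourth is the complex conjugate of the third).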

Starting with an arbitrary 2-local Hamiltonian $H$ on $N$ qubits, we obtain a new fault
tolerant Hamiltonian on $4N$ qubits by the following procedure. An arbitrary 2-local Hamiltonian
can be written as a sum of tensor products of pairs of Pauli matrices acting on different
qubits. After writing out $H$ in this way, make the following replacements

\[ I \to I^{\otimes 4}, \quad X \to X_L, \quad Y \to Y_L, \quad Z \to Z_L \]

to obtain a new 4-local Hamiltonian $H_{SL}$ acting on $4N$ qubits. The total fault tolerant
Hamiltonian $H_S$ is

\[ H_S = H_{SL} + H_{SP} \tag{2.5} \]

where $H_{SP}$ is a sum of penalty terms, one acting on each encoded qubit, providing an
energy penalty of at least $E_p$ for going outside the code space. We use the subscript S to
indicate that the Hamiltonian acts on the system, as opposed to the environment, which we
introduce later. Note that $H_{SL}$ and $H_{SP}$ commute, and thus they share a set of simultaneous
eigenstates.

If the ground space of $H$ is spanned by $|\psi^{(1)}\rangle, \ldots, |\psi^{(m)}\rangle$ then the ground space of $H_S$ is
spanned by the encoded states $|\psi_L^{(1)}\rangle, \ldots, |\psi_L^{(m)}\rangle$. Furthermore, the penalty terms provide
an energy gap against 1-local noise which does not shrink as the size of the computation
grows.
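The replacement procedure above is mechanical, and a small sketch may make it concrete. The code below assumes a Hamiltonian is stored as a list of `(coefficient, Pauli label)` pairs; that representation, the function names, and the example Hamiltonian are all hypothetical choices made for illustration. Commutation is decided purely from the labels: two Pauli strings commute exactly when they clash (both non-identity and different) on an even number of sites.

```python
# Encoded single-qubit Paulis from equation 2.4, as (sign, 4-qubit label).
LOGICAL = {"I": (1, "IIII"), "X": (1, "YIYI"), "Y": (-1, "IXXI"), "Z": (1, "ZZII")}
GENERATORS = ("XXXX", "ZZZZ", "XYZI")  # stabilizer generators, equation 2.6

def encode(terms):
    """Map [(coeff, label)] on N qubits to encoded terms on 4N qubits."""
    encoded = []
    for coeff, label in terms:
        sign, out = 1, ""
        for p in label:
            s, block = LOGICAL[p]
            sign *= s
            out += block
        encoded.append((coeff * sign, out))
    return encoded

def weight(label):
    """Number of non-identity factors in a Pauli label."""
    return sum(p != "I" for p in label)

def commute(a, b):
    """True iff the two Pauli strings commute."""
    clashes = sum(p != "I" and q != "I" and p != q for p, q in zip(a, b))
    return clashes % 2 == 0

def penalty_generators(n_logical):
    """Stabilizer generators of every 4-qubit block, padded to 4N qubits."""
    gens = []
    for block in range(n_logical):
        for g in GENERATORS:
            gens.append("IIII" * block + g + "IIII" * (n_logical - block - 1))
    return gens

# Hypothetical 2-local Hamiltonian on N = 3 qubits.
H = [(1.0, "XXI"), (-0.5, "IZY"), (0.3, "ZII")]
H_SL = encode(H)
```

In this example every encoded term has weight at most 4 and commutes with every generator returned by `penalty_generators(3)`, which is the label-level reason that $H_{SL}$ and $H_{SP}$ commute.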

The code described by equations 2.1 and 2.2 can be obtained using the stabilizer
formalism [82, 137]. In this formalism, a quantum code is not described by explicitly specifying a
set of basis states for the code space. Rather, one specifies the generators of the stabilizer
group for the codespace. Let $G_n$ be the Pauli group on $n$ qubits (i.e. the set of all tensor
products of $n$ Pauli operators with coefficients of $\pm 1$ or $\pm i$). The stabilizer group of a
codespace $C$ is the subgroup $S$ of $G_n$ such that $x|\psi\rangle = |\psi\rangle$ for any $x \in S$ and any $|\psi\rangle \in C$.

A $2^k$-dimensional codespace over $n$ qubits can be specified by choosing $n - k$ independent
commuting generators for the stabilizer group S. By independent we mean that no generator 
can be expressed as a product of others. In our case we are encoding a single qubit using 4 
qubits, thus k = 1 and n = 4, and we need 3 independent commuting generators for S. 

To satisfy the orthogonality conditions, listed in equation 2.3, which are necessary for
error detection, it suffices for each Pauli operator on a given qubit to anticommute with at
least one of the generators of the stabilizer group. The generators

\[ g_1 = X \otimes X \otimes X \otimes X \]
\[ g_2 = Z \otimes Z \otimes Z \otimes Z \]
\[ g_3 = X \otimes Y \otimes Z \otimes I \tag{2.6} \]

satisfy these conditions, and generate the stabilizer group for the code given in equations
2.1 and 2.2.

Adding one term of the form

\[ H_p = -E_p (g_1 + g_2 + g_3) \tag{2.7} \]

to the encoded Hamiltonian for each encoded qubit yields an energy penalty of at least $E_p$
for any state outside the codespace. 
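The spectrum of the penalty term in equation 2.7 can be inspected directly for a single encoded qubit. The sketch below assumes NumPy and sets $E_p = 1$ for illustration; the variable and function names are hypothetical. With the normalization of equation 2.7, the computed gap between the codespace and the next level comes out as $2E_p$, which is consistent with the stated penalty of at least $E_p$.

```python
import itertools
import numpy as np

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(label):
    """Tensor product of Paulis, leftmost factor = qubit 1."""
    op = np.eye(1, dtype=complex)
    for ch in label:
        op = np.kron(op, PAULI[ch])
    return op

E_p = 1.0  # illustrative value
generators = [pauli_string(g) for g in ("XXXX", "ZZZZ", "XYZI")]  # equation 2.6
H_p = -E_p * (generators[0] + generators[1] + generators[2])      # equation 2.7

# Sorted eigenvalues of the 16 x 16 penalty Hamiltonian.
levels = np.linalg.eigvalsh(H_p)
ground = levels[0]
is_ground = np.isclose(levels, ground)
degeneracy = int(np.sum(is_ground))        # dimension of the codespace
gap = levels[~is_ground].min() - ground    # penalty for leaving the codespace

def undetected_errors():
    """Single-qubit Paulis that commute with every generator (should be none)."""
    bad = []
    for qubit, p in itertools.product(range(4), "XYZ"):
        err = pauli_string("I" * qubit + p + "I" * (3 - qubit))
        if not any(np.allclose(err @ g, -(g @ err)) for g in generators):
            bad.append((qubit, p))
    return bad
```

The two-fold ground-space degeneracy matches the requirement $k = 1$: three independent commuting generators on four qubits leave a $2^{4-3}$-dimensional codespace.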

2-local encoded operations are optimal. None of the encoded operations can be made 
1-local, because they would then have the same form as the errors we are trying to detect 
and penalize. Such an operation would not commute with all of the generators. 



2.3 Noise Model 

Intuitively, one expects that providing an energy gap against a Pauli operator applied to 
any qubit protects against 1-local noise. We illustrate this using a model of decoherence 
proposed in [41]. In this model, the quantum computer is a set of spin-1/2 particles weakly 
coupled to a large photon bath. The Hamiltonian for the combined system is 

\[ H = H_S + H_E + \lambda V, \]

where $H_S(t)$ is the adiabatic Hamiltonian that implements the algorithm by acting only on
the spins, $H_E$ is the Hamiltonian which acts only on the photon bath, and $\lambda V$ is a weak
coupling between the spins and the photon bath. Specifically, $V$ is assumed to take the
form

\[ V = \sum_i \int_0^\infty d\omega \left[ g(\omega)\, a_\omega\, \sigma_+^{(i)} + g^*(\omega)\, a_\omega^\dagger\, \sigma_-^{(i)} \right], \]

where $\sigma_\pm^{(i)}$ are raising and lowering operators for the $i$th spin, $a_\omega$ is the annihilation operator
for the photon mode with frequency $\omega$, and $g(\omega)$ is the spectral density.
From this premise Childs et al. obtain the following master equation 



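\[ \frac{d\rho}{dt} = -i[H_S, \rho] - \sum_{a,b} M_{ab}\, \mathcal{E}_{ab}(\rho) \tag{2.8} \]

where

\[ M_{ab} = \sum_i \left[ N_{ba} |g_{ba}|^2 \langle a|\sigma_-^{(i)}|b\rangle \langle b|\sigma_+^{(i)}|a\rangle + (N_{ab} + 1) |g_{ab}|^2 \langle b|\sigma_-^{(i)}|a\rangle \langle a|\sigma_+^{(i)}|b\rangle \right] \]

is a scalar,

\[ \mathcal{E}_{ab}(\rho) = |a\rangle\langle a|\rho + \rho|a\rangle\langle a| - 2|b\rangle\langle a|\rho|a\rangle\langle b| \]

is an operator, $|a\rangle$ is the instantaneous eigenstate of $H_S$ with energy $\omega_a$,

\[ N_{ba} = \frac{1}{\exp[\beta(\omega_b - \omega_a)] - 1} \]

is the Bose-Einstein distribution at temperature $1/\beta$, and

\[ g_{ba} = \begin{cases} \lambda g(\omega_b - \omega_a), & \omega_b > \omega_a, \\ 0, & \omega_b \leq \omega_a. \end{cases} \tag{2.9} \]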

Suppose that we encode the original $N$-qubit Hamiltonian as a $4N$-qubit Hamiltonian
as described above. As stated in equation 2.5, the total spin Hamiltonian $H_S$ on $4N$ spins
consists of the encoded version $H_{SL}$ of the original Hamiltonian plus the penalty terms
$H_{SP}$.

Most adiabatic quantum computations use an initial Hamiltonian with an eigenvalue
gap of order unity, independent of problem size. In such cases, a nearly pure initial state
can be achieved at constant temperature. Therefore, we'll make the approximation that the
spins start in the pure ground state of the initial Hamiltonian, which we'll denote $|0\rangle$. Then
we can use equation 2.8 to examine $d\rho/dt$ at $t = 0$. Since the initial state is $\rho = |0\rangle\langle 0|$,
$\mathcal{E}_{ab}(\rho)$ is zero unless $|a\rangle = |0\rangle$. The master equation at $t = 0$ is therefore

\[ \frac{d\rho}{dt} = -i[H_S, \rho] - \sum_b M_{0b}\, \mathcal{E}_{0b}(\rho). \tag{2.10} \]



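$H_{SP}$ is given by a sum of terms of the form 2.7, and it commutes with $H_{SL}$. Thus, $H_S$
and $H_{SP}$ share a complete set of simultaneous eigenstates. The eigenstates of $H_S$ can thus
be separated into those which are in the codespace $C$ (i.e. the ground space of $H_{SP}$) and
those which are in the orthogonal space $C^\perp$. The ground state $|0\rangle$ is in the codespace. $M_{0b}$
will be zero unless $|b\rangle \in C^\perp$, because $\sigma_\pm = (X \pm iY)/2$, and any Pauli operator applied to
a single qubit takes us from $C$ to $C^\perp$. Equation 2.10 therefore becomes

\[ \frac{d\rho}{dt} = -i[H_S, \rho] - \sum_{b \in C^\perp} M_{0b}\, \mathcal{E}_{0b}(\rho). \tag{2.11} \]

Since $|0\rangle$ is the ground state, $\omega_b \geq \omega_0$, thus equation 2.9 shows that the terms in $M_{0b}$
proportional to $|g_{0b}|^2$ will vanish, leaving only

\[ M_{0b} = \sum_i N_{b0} |g_{b0}|^2 \langle 0|\sigma_-^{(i)}|b\rangle \langle b|\sigma_+^{(i)}|0\rangle. \]

Now let's examine $N_{b0}$.

\[ \omega_b - \omega_0 = \langle b|(H_{SL} + H_{SP})|b\rangle - \langle 0|(H_{SL} + H_{SP})|0\rangle. \]

$|0\rangle$ is in the ground space of $H_{SL}$, thus

\[ \langle b|H_{SL}|b\rangle - \langle 0|H_{SL}|0\rangle \geq 0, \]

and so

\[ \omega_b - \omega_0 \geq \langle b|H_{SP}|b\rangle - \langle 0|H_{SP}|0\rangle. \]

Since $|b\rangle \in C^\perp$ and $|0\rangle \in C$,

\[ \langle b|H_{SP}|b\rangle - \langle 0|H_{SP}|0\rangle \geq E_p, \]

thus $\omega_b - \omega_0 \geq E_p$.

A sufficiently large $\beta E_p$ will make $N_{b0}$ small enough that the term $\sum_{b \in C^\perp} M_{0b}\, \mathcal{E}_{0b}(\rho)$ can
be neglected from the master equation, leaving

\[ \frac{d\rho}{dt} \approx -i[H_S, \rho] \]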



which is just Schrödinger's equation with a Hamiltonian equal to $H_S$ and no decoherence.
Note that the preceding derivation did not depend on the fact that $\sigma_\pm$ are raising and
lowering operators, but only on the fact that they act on a single qubit and can therefore
be expressed as a linear combination of Pauli operators.

$N_{b0}$ is small but nonzero. Thus, after a sufficiently long time, the matrix elements
of $\rho$ involving states other than $|0\rangle$ will become non-negligible and the preceding picture
will break down. How long the computation can be run before this happens depends on
the magnitude of $\sum_{b \in C^\perp} M_{0b}\, \mathcal{E}_{0b}(\rho)$, which shrinks exponentially with $E_p/T$ and grows only
polynomially with the number of qubits $N$. Thus it should be sufficient for $1/T$ to grow
logarithmically with the problem size for the noise due to the terms present in equation
2.8 to be suppressed. In contrast, one expects that if the Hamiltonian had only an inverse
polynomial gap against 1-local noise, the temperature would need to shrink polynomially
rather than logarithmically.
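The contrast between logarithmic and polynomial temperature scaling follows from the Bose-Einstein factor alone. The sketch below uses only the Python standard library. The target occupation $1/N^3$ and the inverse-polynomial gap $1/N$ are hypothetical choices made purely to illustrate the scaling, and units are chosen so that $E_p = 1$.

```python
import math

def bose_einstein(beta, gap):
    """Thermal occupation N = 1 / (exp(beta * gap) - 1) at energy `gap`."""
    return 1.0 / math.expm1(beta * gap)

def beta_needed(gap, target):
    """Smallest inverse temperature beta with bose_einstein(beta, gap) <= target."""
    # Solve 1 / (exp(beta * gap) - 1) = target for beta.
    return math.log1p(1.0 / target) / gap

def compare_scaling(sizes, power=3):
    """For each problem size N, return (N, beta for constant gap, beta for gap 1/N)."""
    rows = []
    for n in sizes:
        target = 1.0 / n ** power          # hypothetical suppression target
        rows.append((n, beta_needed(1.0, target), beta_needed(1.0 / n, target)))
    return rows

for n, beta_const, beta_poly in compare_scaling([10, 100, 1000]):
    print(f"N={n:5d}  constant gap: beta={beta_const:8.2f}  gap 1/N: beta={beta_poly:10.2f}")
```

With a constant gap the required $\beta = 1/T$ grows like $\log N$, whereas with a gap of $1/N$ it grows like $N \log N$.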

One should note that equation 2.8 is derived by truncating at second order in the
coupling between the system and bath. Thus, at sufficiently long timescales, higher order
couplings may become relevant. Physical 1-local terms can give rise to $k$-local virtual terms
at $k$th order in perturbation theory, thus protecting against these may require an extension
of the technique presented here. Nevertheless, the technique presented here protects against
the lowest order noise terms, which should be the largest ones provided that the coupling 
to the environment is weak. 

2.4 Higher Weight Errors 

Now that we know how to obtain a constant gap against 1-local noise, we may ask whether 
the same is possible for 2-local noise. To accomplish this we need to find a stabilizer group 
such that any pair of Pauli operators on two qubits anticommutes with at least one of the
generators. This is exactly the property satisfied by the standard [137] 5-qubit stabilizer
code, whose stabilizer group is generated by 

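\[ g_1 = X \otimes Z \otimes Z \otimes X \otimes I \]
\[ g_2 = I \otimes X \otimes Z \otimes Z \otimes X \]
\[ g_3 = X \otimes I \otimes X \otimes Z \otimes Z \]
\[ g_4 = Z \otimes X \otimes I \otimes X \otimes Z. \tag{2.12} \]

The codewords for this code are

\[ |0_L\rangle = \tfrac{1}{4} \big[\, |00000\rangle + |10010\rangle + |01001\rangle + |10100\rangle \]
\[ \qquad + |01010\rangle - |11011\rangle - |00110\rangle - |11000\rangle \]
\[ \qquad - |11101\rangle - |00011\rangle - |11110\rangle - |01111\rangle \]
\[ \qquad - |10001\rangle - |01100\rangle - |10111\rangle + |00101\rangle \,\big] \]

\[ |1_L\rangle = \tfrac{1}{4} \big[\, |11111\rangle + |01101\rangle + |10110\rangle + |01011\rangle \]
\[ \qquad + |10101\rangle - |00100\rangle - |11001\rangle - |00111\rangle \]
\[ \qquad - |00010\rangle - |11100\rangle - |00001\rangle - |10000\rangle \]
\[ \qquad - |01110\rangle - |10011\rangle - |01000\rangle + |11010\rangle \,\big]. \]

The encoded Pauli operations for this code are conventionally expressed as

\[ X_L = X \otimes X \otimes X \otimes X \otimes X \]
\[ Y_L = Y \otimes Y \otimes Y \otimes Y \otimes Y \]
\[ Z_L = Z \otimes Z \otimes Z \otimes Z \otimes Z. \]

However, multiplying these encoded operations by members of the stabilizer group doesn't
affect their action on the codespace. Thus we obtain the following equivalent set of encoded
operations.

\[ X_L = -X \otimes I \otimes Y \otimes Y \otimes I \]
\[ Y_L = -Z \otimes Z \otimes I \otimes Y \otimes I \]
\[ Z_L = -Y \otimes Z \otimes Y \otimes I \otimes I \tag{2.13} \]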

These operators are all 3-local. This is the best that can be hoped for, because the code 
protects against 2-local operations and therefore any 2-local operation must anticommute 
with at least one of the generators. 
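Both statements about the 5-qubit code, namely that the operators in equation 2.13 differ from the conventional ones only by stabilizer elements and that every Pauli operator on at most two qubits anticommutes with some generator, can be checked by brute force. The sketch below assumes NumPy; the helper names are illustrative, and the particular stabilizer elements used ($g_2$, $g_2 g_3$, and $g_3$) were worked out for this check rather than taken from the text.

```python
import itertools
import numpy as np

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(label):
    """Tensor product of Paulis, leftmost factor = qubit 1."""
    op = np.eye(1, dtype=complex)
    for ch in label:
        op = np.kron(op, PAULI[ch])
    return op

# Generators of the 5-qubit code, equation 2.12.
g1, g2, g3, g4 = (pauli_string(s) for s in ("XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"))
generators = [g1, g2, g3, g4]

# 3-local encoded operators, equation 2.13.
X_L = -pauli_string("XIYYI")
Y_L = -pauli_string("ZZIYI")
Z_L = -pauli_string("YZYII")

def low_weight_labels(n=5, max_weight=2):
    """All Pauli labels on n qubits acting nontrivially on 1..max_weight qubits."""
    for label in itertools.product("IXYZ", repeat=n):
        w = sum(p != "I" for p in label)
        if 1 <= w <= max_weight:
            yield "".join(label)

def undetected(max_weight=2):
    """Low-weight Paulis that commute with every generator (should be none)."""
    bad = []
    for label in low_weight_labels(5, max_weight):
        err = pauli_string(label)
        if not any(np.allclose(err @ g, -(g @ err)) for g in generators):
            bad.append(label)
    return bad
```

For example, `np.allclose(X_L, pauli_string("XXXXX") @ g2)` confirms that the 3-local $X_L$ is the conventional one multiplied by the generator $g_2$.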

Besides increasing the locality of the encoded operations, one can seek to decrease the 
number of qubits used to construct the codewords. The quantum singleton bound [137] 
shows that the five qubit code is already optimal and cannot be improved in this respect. 

The distance $d$ of a quantum code is the minimum number of qubits of a codeword which
need to be modified before obtaining a nonzero inner product with a different codeword.
For example, applying $X_L$, which is 3-local, to $|0_L\rangle$ of the 5-qubit code converts it into
$|1_L\rangle$, but applying any 2-local operator to any of the codewords yields something outside
the codespace. Thus the distance of the 5-qubit code is 3. Similarly the distance of our
4-qubit code is 2. To detect $t$ errors a code needs a distance of $t + 1$, and to correct $t$ errors,
it needs a distance of $2t + 1$.

The quantum singleton bound states that the distance of any quantum code which uses
$n$ qubits to encode $k$ qubits will satisfy

\[ n - k \geq 2(d - 1). \tag{2.14} \]

To detect 2 errors, a code must have distance 3. A code which encodes a single qubit with
distance 3 must use at least 5 qubits, by equation 2.14. Thus the 5-qubit code is optimal.
To detect 1 error, a code must have distance 2. A code which encodes a single qubit with
distance 2 must have at least 3 qubits, by equation 2.14. Thus it appears possible that our
4-qubit code is not optimal. However, no 3-qubit stabilizer code can detect all single-qubit
errors, which we show as follows.

The stabilizer group for a 3-qubit code would have two independent generators, each
being a tensor product of 3 Pauli operators.

\[ g_1 = \sigma_{11} \otimes \sigma_{12} \otimes \sigma_{13} \]
\[ g_2 = \sigma_{21} \otimes \sigma_{22} \otimes \sigma_{23} \]

These must satisfy the following two conditions: (1) they commute, and (2) an X, Y, or
Z on any of the three qubits anticommutes with at least one of the generators. This is
impossible, because condition (2) requires $\sigma_{1i} \neq \sigma_{2i}$, with neither equal to the identity,
for each $i = 1, 2, 3$. In this case the generators anticommute on each of the three qubits, so $g_1$
and $g_2$ anticommute.
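The impossibility argument for three qubits is small enough to confirm by exhaustive search. The sketch below uses only the Python standard library and works with Pauli labels, ignoring overall signs (which do not affect commutation). It searches over all pairs of distinct 3-qubit Pauli labels without imposing independence, so it covers a superset of the candidate generator pairs; the function names are illustrative.

```python
import itertools

def commute(a, b):
    """Pauli strings commute iff they clash on an even number of sites."""
    clashes = sum(p != "I" and q != "I" and p != q for p, q in zip(a, b))
    return clashes % 2 == 0

def detects_all_single_qubit_errors(generators, n):
    """True iff every X, Y, Z on every qubit anticommutes with some generator."""
    for qubit, p in itertools.product(range(n), "XYZ"):
        err = "I" * qubit + p + "I" * (n - qubit - 1)
        if all(commute(err, g) for g in generators):
            return False
    return True

def search(n, num_generators):
    """Mutually commuting label sets on n qubits that detect all single-qubit errors."""
    labels = ["".join(t) for t in itertools.product("IXYZ", repeat=n)]
    found = []
    for gens in itertools.combinations(labels, num_generators):
        pairwise = all(commute(a, b) for a, b in itertools.combinations(gens, 2))
        if pairwise and detects_all_single_qubit_errors(gens, n):
            found.append(gens)
    return found

three_qubit_candidates = search(3, 2)   # expected to be empty by the argument above
four_qubit_ok = detects_all_single_qubit_errors(("XXXX", "ZZZZ", "XYZI"), 4)
```

The search returns no candidates for three qubits, while the generators of equation 2.6 pass the same detection test on four qubits.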

The stabilizer formalism describes most but not all currently known quantum error 
correcting codes. We do not know whether a 3-qubit code which detects all single-qubit 
errors while still maintaining 2-local encoded operations can be found by going outside the 
stabilizer formalism. It may also be interesting to investigate whether there exist compu- 
tationally universal 3-local or 2-local adiabatic Hamiltonians with a constant energy gap 
against local noise. 
Chapter 3



DQC1-completeness of Jones
Polynomials 

3.1 Introduction 

It is known that evaluating a certain approximation to the Jones polynomial for the plat 
closure of a braid is a BQP-complete problem. That is, this problem exactly captures the 
power of the quantum circuit model [TU [U Hj. The one clean qubit model is a model of 
quantum computation in which all but one qubit starts in the maximally mixed state. One 
clean qubit computers are believed to be strictly weaker than standard quantum computers, 
but still capable of solving some classically intractable problems [118]. Here we show that
evaluating a certain approximation to the Jones polynomial at a fifth root of unity for the 
trace closure of a braid is a complete problem for the one clean qubit complexity class. That 
is, a one clean qubit computer can approximate these Jones polynomials in time polynomial 
in both the number of strands and number of crossings, and the problem of simulating a 
one clean qubit computer is reducible to approximating the Jones polynomial of the trace 
closure of a braid. 

3.2 One Clean Qubit 

The one clean qubit model of quantum computation originated as an idealized model of 
quantum computation on highly mixed initial states, such as appear in NMR implementations \118\ 
114] . In this model, one is given an initial quantum state consisting of a single qubit in the 
pure state |0), and n qubits in the maximally mixed state. This is described by the density 
matrix 

\[ \rho = |0\rangle\langle 0| \otimes \frac{I}{2^n}. \]

One can apply any polynomial-size quantum circuit to $\rho$, and then measure the first
qubit in the computational basis. Thus, if the quantum circuit implements the unitary
transformation $U$, the probability of measuring $|0\rangle$ will be

\[ p_0 = \mathrm{Tr}[(|0\rangle\langle 0| \otimes I)\, U \rho U^\dagger] = 2^{-n}\, \mathrm{Tr}[(|0\rangle\langle 0| \otimes I)\, U\, (|0\rangle\langle 0| \otimes I)\, U^\dagger]. \tag{3.1} \]
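Equation 3.1 can be evaluated directly for small systems. The following sketch assumes NumPy and uses a random unitary obtained from a QR decomposition (unitary, though not exactly Haar distributed) as a stand-in for the circuit; the function names are illustrative. The clean qubit is taken to be the leftmost tensor factor, so $|0\rangle\langle 0| \otimes I$ projects onto the first $2^n$ basis states, and $p_0$ is the squared Frobenius norm of the top-left block of $U$ divided by $2^n$.

```python
import numpy as np

def random_unitary(dim, rng):
    """A unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(a)
    return q

def p0_density_matrix(U, n):
    """First expression in (3.1): Tr[(|0><0| (x) I) U rho U^dagger]."""
    dim = 2 ** n
    proj = np.kron(np.diag([1.0, 0.0]), np.eye(dim))   # |0><0| (x) I
    rho = proj / dim                                    # |0><0| (x) I / 2^n
    return np.trace(proj @ U @ rho @ U.conj().T).real

def p0_trace_formula(U, n):
    """Second expression in (3.1): 2^-n Tr[(P (x) I) U (P (x) I) U^dagger]."""
    dim = 2 ** n
    proj = np.kron(np.diag([1.0, 0.0]), np.eye(dim))
    return (np.trace(proj @ U @ proj @ U.conj().T) / dim).real

def p0_block_norm(U, n):
    """Same quantity as the squared norm of the top-left 2^n x 2^n block of U."""
    dim = 2 ** n
    return float(np.sum(np.abs(U[:dim, :dim]) ** 2) / dim)

n = 3                                   # number of maximally mixed qubits
rng = np.random.default_rng(0)
U = random_unitary(2 ** (n + 1), rng)   # acts on the clean qubit plus n mixed qubits
```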

Computational complexity classes are typically described using decision problems, that 
is, problems which admit yes/no answers. This is mathematically convenient, and the 
implications for the complexity of non-decision problems are usually straightforward to
obtain (cf. [141]). The one clean qubit complexity class consists of the decision problems
which can be solved in polynomial time by a one clean qubit machine with correctness
probability of at least 2/3. The experiment described in equation 3.1 can be repeated
polynomially many times. Thus, if $p_1 > 1/2 + \epsilon$ for instances to which the answer is yes,
and $p_1 < 1/2 - \epsilon$ otherwise, where $p_1 = 1 - p_0$ is the probability of measuring $|1\rangle$, then by
repeating the experiment poly($1/\epsilon$) times and taking
the majority vote one can achieve 2/3 probability of correctness. Thus, as long as $\epsilon$ is at
least an inverse polynomial in the problem size, the problem is contained in the one clean
qubit complexity class. Following [118], we will refer to this complexity class as DQC1.
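The amplification step can be made quantitative with an exact binomial computation. The sketch below uses only the Python standard library; it assumes independent repetitions, each giving the correct outcome with probability $1/2 + \epsilon$, and the function names are illustrative.

```python
import math

def majority_success(p, trials):
    """Probability that more than half of `trials` independent runs succeed.

    `trials` is assumed odd so that ties cannot occur; each run succeeds
    with probability p.
    """
    return sum(
        math.comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(trials // 2 + 1, trials + 1)
    )

def trials_needed(eps, target=2 / 3):
    """Smallest odd number of runs whose majority vote is correct w.p. >= target."""
    trials = 1
    while majority_success(0.5 + eps, trials) < target:
        trials += 2
    return trials

for eps in (0.1, 0.05, 0.02):
    print(eps, trials_needed(eps))
```

The required number of repetitions grows roughly like $1/\epsilon^2$, which is polynomial in the problem size whenever $\epsilon$ is at least an inverse polynomial.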

A number of equivalent definitions of the one clean qubit complexity class can be made. 
For example, changing the pure part of the initial state and the basis in which the final 
measurement is performed does not change the resulting complexity class. Less trivially, 
allowing logarithmically many clean qubits results in the same class, as discussed below. It 
is essential that on a given copy of $\rho$, measurements are performed only at the end of the
computation. Otherwise, one could obtain a pure state by measuring $\rho$, thus making all the
qubits "clean" and re-obtaining BQP. Remarkably, it is not necessary to have even one fully
polarized qubit to obtain the class DQC1. As shown in [118], a single partially polarized
qubit suffices.

In the original definition [118] of DQC1 it is assumed that a classical computer generates
the quantum circuits to be applied to the initial state $\rho$. By this definition DQC1
automatically contains the complexity class P. However, it is also interesting to consider a slightly
weaker one clean qubit model, in which the classical computer controlling the quantum
circuits has only the power of NC1. The resulting complexity class appears to have the
interesting property that it is incomparable to P. That is, it is not contained in P nor does
P contain it. We suspect that our algorithm and hardness proof for the Jones polynomial
carry over straightforwardly to this NC1-controlled one clean qubit model. However, we
have not pursued this point. 

Any $2^n \times 2^n$ unitary matrix can be decomposed as a linear combination of $n$-fold tensor
products of Pauli matrices. As discussed in [118], the problem of estimating a coefficient in
the Pauli decomposition of a quantum circuit to polynomial accuracy is a DQC1-complete
problem. Estimating the normalized trace of a quantum circuit is a special case of this, and
it is also DQC1-complete. This point is discussed in p3H]. To make our presentation
self-contained, we will sketch here a proof that trace estimation is DQC1-complete. Technically,
we should consider the decision problem of determining whether the trace is greater than 
a given threshold. However, the trace estimation problem is easily reduced to its decision 
version by the method of binary search, so we will henceforth ignore this point. 

First we'll show that trace estimation is contained in DQC1. Suppose we are given a
quantum circuit on $n$ qubits which consists of polynomially many gates from some finite
universal gate set. Given a state $|\psi\rangle$ of $n$ qubits, there is a standard technique for estimating
$\langle\psi|U|\psi\rangle$, called the Hadamard test [6], as shown in figure 3-1. Now suppose that we use
the circuit from figure 3-1, but choose $|\psi\rangle$ uniformly at random from the $2^n$ computational
basis states. Then the probability of getting outcome $|0\rangle$ for a given measurement will be

\[ p_0 = \frac{1}{2^n} \sum_{x \in \{0,1\}^n} \frac{1 + \mathrm{Re}(\langle x|U|x\rangle)}{2} = \frac{1}{2} + \frac{\mathrm{Re}(\mathrm{Tr}\, U)}{2^{n+1}}. \]

Choosing $|\psi\rangle$ uniformly at random from the $2^n$ computational basis states is exactly the
same as inputting the density matrix $I/2^n$ to this register. Thus, the only clean qubit is
the control qubit. Trace estimation is therefore achieved in the one clean qubit model by
converting the given circuit for $U$ into a circuit for controlled-$U$ and adding Hadamard
gates on the control bit. One can convert a circuit for $U$ into a circuit for controlled-$U$
by replacing each gate $G$ with a circuit for controlled-$G$. The overhead incurred is thus
bounded by a constant factor [137].

[Figure 3-1 (circuit diagram): the control qubit is prepared in $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$, and a Hadamard gate $H$ is applied to it before measurement.]

Figure 3-1: This circuit implements the Hadamard test. A horizontal line represents a
qubit. A horizontal line with a slash through it represents a register of multiple qubits.
The probability $p_0$ of measuring $|0\rangle$ is as shown above. Thus, one can obtain the real part
of $\langle\psi|U|\psi\rangle$ to precision $\epsilon$ by making $O(1/\epsilon^2)$ measurements and counting what fraction
of the measurement outcomes are $|0\rangle$. Similarly, if the control bit is instead initialized to
$\frac{1}{\sqrt{2}}(|0\rangle - i|1\rangle)$, one can estimate the imaginary part of $\langle\psi|U|\psi\rangle$.
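The trace-estimation argument can be reproduced with a small density-matrix simulation. The sketch below assumes NumPy and builds the Hadamard test explicitly: a Hadamard gate on the control qubit, a controlled-$U$, and a second Hadamard gate, with the register in the maximally mixed state. The random unitary is only a stand-in for a circuit, and the function names are illustrative.

```python
import numpy as np

def random_unitary(dim, rng):
    """A unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(a)
    return q

def hadamard_test_p0(U):
    """Probability of outcome |0> on the control when the register is maximally mixed."""
    dim = U.shape[0]
    eye = np.eye(dim)
    had = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    # controlled-U = |0><0| (x) I + |1><1| (x) U, with the control as the first factor
    controlled_u = np.kron(np.diag([1, 0]), eye) + np.kron(np.diag([0, 1]), U)
    circuit = np.kron(had, eye) @ controlled_u @ np.kron(had, eye)
    rho = np.kron(np.diag([1.0, 0.0]), eye / dim)   # clean control, mixed register
    out = circuit @ rho @ circuit.conj().T
    proj0 = np.kron(np.diag([1.0, 0.0]), eye)
    return np.trace(proj0 @ out).real

def predicted_p0(U):
    """Closed form from the text: 1/2 + Re(Tr U) / 2^(n+1)."""
    dim = U.shape[0]
    return 0.5 + np.trace(U).real / (2 * dim)

n = 3
rng = np.random.default_rng(2)
U = random_unitary(2 ** n, rng)
```

In an actual one clean qubit experiment $p_0$ would be estimated from repeated measurements; the simulation computes it exactly so that it can be compared with the closed form.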

Next we'll show that trace estimation is hard for DQC1. Suppose we are given a classical
description of a quantum circuit implementing some unitary transformation $U$ on $n$ qubits.
As shown in equation 3.1, the probability of obtaining outcome $|0\rangle$ from the one clean
qubit computation of this circuit is proportional to the trace of the non-unitary operator
$(|0\rangle\langle 0| \otimes I)\, U\, (|0\rangle\langle 0| \otimes I)\, U^\dagger$, which acts on $n$ qubits. Estimating this can be achieved by
estimating the trace of

$U' =$ [circuit diagram: the circuits for $U$ and $U^\dagger$ acting on the $n$ qubits, together with two CNOT gates whose targets are two additional qubits]

which is a unitary operator on $n + 2$ qubits. This suffices because

\[ \mathrm{Tr}[(|0\rangle\langle 0| \otimes I)\, U\, (|0\rangle\langle 0| \otimes I)\, U^\dagger] = \frac{1}{4} \mathrm{Tr}[U']. \tag{3.2} \]

To see this, we can think in terms of the computational basis:

\[ \mathrm{Tr}[U'] = \sum_{x \in \{0,1\}^{n+2}} \langle x|U'|x\rangle. \]

If the first qubit of $|x\rangle$ is $|1\rangle$, then the rightmost CNOT in $U'$ will flip the lowermost qubit.
The resulting state will be orthogonal to $|x\rangle$ and the corresponding matrix element will not
contribute to the trace. Thus this CNOT gate simulates the initial projector $|0\rangle\langle 0| \otimes I$ in
equation 3.2. Similarly, the other CNOT in $U'$ simulates the other projector in equation
3.2.
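Equation 3.2 can be verified numerically for small $n$. Since the circuit diagram for $U'$ is not reproduced here, the sketch below assumes one gate ordering consistent with the description in the text: a CNOT from the first qubit onto one extra qubit, then $U$, then a CNOT from the first qubit onto the other extra qubit, then $U^\dagger$. That ordering, the placement of the two extra qubits as the last tensor factors, and the function names are assumptions made for illustration. NumPy is assumed.

```python
import numpy as np

P0 = np.diag([1.0, 0.0])                   # |0><0|
P1 = np.diag([0.0, 1.0])                   # |1><1|
X = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)

def random_unitary(dim, rng):
    """A unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(a)
    return q

def cnot_from_first_qubit(n, target):
    """CNOT controlled by qubit 1 of the n-qubit register, flipping extra qubit 0 or 1."""
    rest = np.eye(2 ** (n - 1))
    flip = np.kron(X, I2) if target == 0 else np.kron(I2, X)
    return (np.kron(np.kron(P0, rest), np.eye(4))
            + np.kron(np.kron(P1, rest), flip))

def build_u_prime(U, n):
    """Assumed form of U' on n + 2 qubits; the rightmost factor acts first."""
    pad = np.eye(4)
    return (np.kron(U.conj().T, pad) @ cnot_from_first_qubit(n, 1)
            @ np.kron(U, pad) @ cnot_from_first_qubit(n, 0))

n = 3
rng = np.random.default_rng(1)
U = random_unitary(2 ** n, rng)
proj = np.kron(P0, np.eye(2 ** (n - 1)))   # |0><0| (x) I on the n-qubit register
lhs = np.trace(proj @ U @ proj @ U.conj().T)
rhs = np.trace(build_u_prime(U, n)) / 4
```

Each CNOT contributes a factor of 2 from the trace over its target qubit when the control is $|0\rangle$ and nothing when the control is $|1\rangle$, which accounts for the factor $\frac{1}{4}$.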

The preceding analysis shows that, given a description of a quantum circuit
implementing a unitary transformation $U$ on $n$ qubits, the problem of approximating $\frac{1}{2^n} \mathrm{Tr}\, U$ to within
$\frac{1}{\mathrm{poly}(n)}$ precision is DQC1-complete.

Figure 3-2: Here CNOT gates are used to simulate 3 clean ancilla qubits.



Some unitaries may only be efficiently implementable using ancilla bits. That is, to
implement $U$ on $n$ qubits using a quantum circuit, it may be most efficient to construct
a circuit on $n + m$ qubits which acts as $U \otimes I$, provided that the $m$ ancilla qubits are all
initialized to $|0\rangle$. These ancilla qubits are used as work bits in intermediate steps of the
computation. To estimate the trace of $U$, one can construct a circuit $U_a$ on $n + 2m$ qubits
by adding CNOT gates controlled by the $m$ ancilla qubits and acting on $m$ extra qubits, as
shown in figure 3-2. This simulates the presence of $m$ clean ancilla qubits, because if any of
the ancilla qubits is in the $|1\rangle$ state then the CNOT gate will flip the corresponding extra
qubit, resulting in an orthogonal state which will not contribute to the trace.

With one clean qubit, one can estimate the trace of $U_a$ to a precision of $\frac{2^{n+2m}}{\mathrm{poly}(n,m)}$. By
construction, $\mathrm{Tr}[U_a] = 2^m\, \mathrm{Tr}[U]$. Thus, if $m$ is logarithmic in $n$, then one can obtain $\mathrm{Tr}[U]$ to
precision $\frac{2^n}{\mathrm{poly}(n)}$, just as can be obtained for circuits not requiring ancilla qubits. This line
of reasoning also shows that the $k$-clean qubit model gives rise to the same complexity class
as the one clean qubit model, for any constant $k$, and even for $k$ growing logarithmically
with $n$.
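The identity $\mathrm{Tr}[U_a] = 2^m\, \mathrm{Tr}[U]$ can also be checked on a small example. The sketch below assumes NumPy and orders the qubits as (ancillas, register, extras), so that the states with all ancillas in $|0\rangle$ occupy the first $2^n$ basis indices. The circuit $W$ on $n + m$ qubits is modeled as block diagonal, acting as $U$ when the ancillas are all $|0\rangle$ and as an arbitrary unitary $V$ otherwise; this block structure, the qubit ordering, and the function names are assumptions made for illustration.

```python
import numpy as np

def random_unitary(dim, rng):
    """A unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(a)
    return q

def build_u_a(U, V, n, m):
    """Circuit on n + 2m qubits: CNOTs from the ancillas onto the extras, then W."""
    dim_r, dim_a = 2 ** n, 2 ** m
    # W acts as U on the ancilla = 0...0 block and as V on its complement.
    W = np.zeros((dim_a * dim_r, dim_a * dim_r), dtype=complex)
    W[:dim_r, :dim_r] = U
    W[dim_r:, dim_r:] = V
    # Permutation matrix for the m CNOT gates: |a, r, e> -> |a, r, e XOR a>.
    total = dim_a * dim_r * dim_a
    cnots = np.zeros((total, total))
    for a in range(dim_a):
        for r in range(dim_r):
            for e in range(dim_a):
                src = (a * dim_r + r) * dim_a + e
                dst = (a * dim_r + r) * dim_a + (e ^ a)
                cnots[dst, src] = 1.0
    return np.kron(W, np.eye(dim_a)) @ cnots

n, m = 2, 2
rng = np.random.default_rng(3)
U = random_unitary(2 ** n, rng)
V = random_unitary(2 ** n * (2 ** m - 1), rng)   # arbitrary action off the ancilla-zero block
U_a = build_u_a(U, V, n, m)
```

Only basis states with every ancilla in $|0\rangle$ survive in the trace, and each of the $2^m$ settings of the extra qubits contributes one copy of $\mathrm{Tr}[U]$, independently of the choice of $V$.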

It seems unlikely that the trace of these exponentially large unitary matrices can be 
estimated to this precision on a classical computer in polynomial time. Thus it seems 
unlikely that DQC1 is contained in P. (For more detailed analysis of this point see [54].)
However, it also seems unlikely that DQC1 contains all of BQP. In other words, one clean
qubit computers seem to provide exponential speedup over classical computation for some 
problems despite being strictly weaker than standard quantum computers. 

3.3 Jones Polynomials 

A knot is defined to be an embedding of the circle in $\mathbb{R}^3$ considered up to continuous
transformation (isotopy). More generally, a link is an embedding of one or more circles in
$\mathbb{R}^3$ up to isotopy. In an oriented knot or link, one of the two possible traversal directions is
chosen for each circle. Some examples of knots and links are shown in figure 3-3. One of the
fundamental tasks in knot theory is, given two representations of knots, which may appear 
superficially different, determine whether these both represent the same knot. In other 
words, determine whether one knot can be deformed into the other without ever cutting 
the strand. 

Reidemeister showed in 1927 that two knots are the same if and only if one can be
deformed into the other by some sequence constructed from three elementary moves, known
as the Reidemeister moves, shown in figure 3-4. This reduces the problem of distinguishing
knots to a combinatorial problem, although one for which no efficient solution is known. In
some cases, the sequence of Reidemeister moves needed to show equivalence of two knots
involves intermediate steps that increase the number of crossings. Thus, it is very difficult
to show upper bounds on the number of moves necessary. The most thoroughly studied
knot equivalence problem is the problem of deciding whether a given knot is equivalent to
the unknot. Even showing the decidability of this problem is highly nontrivial. This was
achieved by Haken in 1961 [87]. In 1998 it was shown by Hass, Lagarias, and Pippenger that
the problem of recognizing the unknot is contained in NP [93].

Figure 3-3: Shown from left to right are the unknot, another representation of the unknot,
an oriented trefoil knot, and the Hopf link. Broken lines indicate under crossings.

Figure 3-4: Two knots are the same if and only if one can be deformed into the other by
some sequence of the three Reidemeister moves shown above.

A knot invariant is any function on knots which is invariant under the Reidemeister 
moves. Thus, a knot invariant always takes the same value for different representations 
of the same knot, such as the two representations of the unknot shown in figure 3-3. In
general, there can be distinct knots which a knot invariant fails to distinguish. 

One of the best known knot invariants is the Jones polynomial, discovered in 1985 by 
Vaughan Jones [106]. To any oriented knot or link, it associates a Laurent polynomial in
the variable $t^{1/2}$. The Jones polynomial has a degree in $t$ which grows at most linearly
with the number of crossings in the link. The coefficients are all integers, but they may be
exponentially large. Exact evaluation of Jones polynomials at all but a few special values
of $t$ is #P-hard [101]. The Jones polynomial can be defined recursively by a simple "skein"
relation. However, for our purposes it will be more convenient to use a definition in terms 
of a representation of the braid group, as discussed below. 




Figure 3-5: Shown from left to right are a braid, its plat closure, and its trace closure. 
To describe in more detail the computation of Jones polynomials we must specify how
the knot will be represented on the computer. Although an embedding of a circle in $\mathbb{R}^3$ is a
continuous object, all the topologically relevant information about a knot can be described
in the discrete language of the braid group. Links can be constructed from braids by joining
the free ends. Two ways of doing this are taking the plat closure and the trace closure, as
shown in figure 3-5. Alexander's theorem states that any link can be constructed as the
trace closure of some braid. Any link can also be constructed as the plat closure of some
braid. This can be easily proven as a corollary to Alexander's theorem, as shown in figure
3-6.



Figure 3-6: A trace closure of a braid on n strands can be converted to a plat closure of a 
braid on 2n strands by moving the "return" strands into the braid. 
Figure 3-7: Shown are the two Markov moves. Here the boxes represent arbitrary braids.
If a function on braids is invariant under these two moves, then the corresponding function 
on links induced by the trace closure is a link invariant. 



Given that the trace closure provides a correspondence between links and braids, one 
may attempt to find functions on braids which yield link invariants via this correspondence. 
Markov's theorem shows that a function on braids will yield a knot invariant provided 
it is invariant under the two Markov moves, shown in figure 3-7. Thus the Markov moves
provide an analogue for braids of the Reidemeister moves on links. The constraints imposed 
by invariance under the Reidemeister moves are enforced in the braid picture jointly by 
invariance under Markov moves and by the defining relations of the braid group. 

A linear function $f$ satisfying $f(AB) = f(BA)$ is called a trace. The ordinary trace on
matrices is one such function. Taking a trace of a representation of the braid group yields a 
function on braids which is invariant under Markov move I. If the trace and representation 
are such that the resulting function is also invariant under Markov move II, then a link 
invariant will result. The Jones polynomial can be obtained in this way. 

In [6], Aharonov et al. show that an additive approximation to the Jones polynomial of
the plat or trace closure of a braid at $t = e^{2\pi i/k}$ can be computed on a quantum computer
in time which scales polynomially in the number of strands and crossings in the braid and 
in k. In [U I174j . it is shown that for plat closures, this problem is BQP-complete. The 
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complexity of approximating the Jones polynomial for trace closures was left open, other 
than showing that it is contained in BQP. 

The results of [6, 4, 174] reformulate and generalize the previous results of Freedman 
et al. [74, 73], which show that certain approximations of Jones polynomials are 
BQP-complete. The work of Freedman et al. in turn builds upon Witten's discovery of a 
connection between Jones polynomials and topological quantum field theory [173]. Recently, 
Aharonov et al. have generalized further, obtaining an efficient quantum algorithm for 
approximating the Tutte polynomial for any planar graph, at any point in the complex plane, 
and also showing BQP-hardness at some points [5]. As special cases, the Tutte polynomial 
includes the Jones polynomial, other knot invariants such as the HOMFLY polynomial, and 
partition functions for some physical models such as the Potts model. 

The algorithm of Aharonov et al. works by obtaining the Jones polynomial as a trace of 
the path model representation of the braid group. The path model representation is unitary 
for $t = e^{2\pi i/k}$ and, as shown in [6], can be efficiently implemented by quantum circuits. For 
computing the trace closure of a braid the necessary trace is similar to the ordinary matrix 
trace except that only a subset of the diagonal elements of the unitary implemented by 
the quantum circuit are summed, and there is an additional weighting factor. For the plat 
closure of a braid the computation instead reduces to evaluating a particular matrix element 
of the quantum circuit. Aharonov et al. also use the path model representation in their 
proof of BQP-completeness. 

Given a braid $b$, we know that the problem of approximating the Jones polynomial of 
its plat closure is BQP-hard. By Alexander's theorem, one can obtain a braid $b'$ whose 
trace closure is the same link as the plat closure of $b$. The Jones polynomial depends only 
on the link, and not on the braid it was derived from. Thus, one may ask why this doesn't 
immediately imply that estimating the Jones polynomial of the trace closure is a BQP-hard 
problem. The answer lies in the degree of approximation. As discussed in section 3.7, 
the BQP-complete problem for plat closures is to approximate the Jones polynomial to a 
certain precision which depends exponentially on the number of strands in the braid. The 
number of strands in $b'$ can be larger than the number of strands in $b$, hence the degree of 
approximation obtained after applying Alexander's theorem may be too poor to solve the 
original BQP-hard problem. 

The fact that computing the Jones polynomial of the trace closure of a braid can be 
reduced to estimating a generalized trace of a unitary operator and the fact that trace 
estimation is DQC1-complete suggest a connection between Jones polynomials and the one 
clean qubit model. Here we find such a connection by showing that evaluating a certain 
approximation to the Jones polynomial of the trace closure of a braid at a fifth root of 
unity is DQC1-complete. The main technical difficulty is obtaining the Jones polynomial 
as a trace over the entire Hilbert space rather than as a summation of some subset of the 
diagonal matrix elements. To do this we will not use the path model representation of the 
braid group, but rather the Fibonacci representation, as described in the next section. 

3.4 Fibonacci Representation 

The Fibonacci representation $\rho_F^{(n)}$ of the braid group $B_n$ is described in [111] in the 
context of Temperley-Lieb recoupling theory. Temperley-Lieb recoupling theory describes two 
species of idealized "particles" denoted by p and *. We will not delve into the conceptual and 
mathematical underpinnings of Temperley-Lieb recoupling theory. For present purposes, 
it will be sufficient to regard it as a formal procedure for obtaining a particular unitary 
representation of the braid group whose trace yields the Jones polynomial at $t = e^{2\pi i/5}$. 
Throughout most of this paper it will be more convenient to express the Jones polynomial 
in terms of $A = e^{-3\pi i/5}$, with $t$ defined by $t = A^{-4}$. 

Figure 3-8: For an $n$-strand braid we can write a length $n + 1$ string of p and * symbols 
across the base. The string may have no two * symbols in a row, but can be otherwise 
arbitrary. 

Figure 3-9: $\sigma_i$ denotes the elementary crossing of strands $i$ and $i + 1$. The braid group 
on $n$ strands $B_n$ is generated by $\sigma_1, \ldots, \sigma_{n-1}$, which satisfy the relations 
$\sigma_i \sigma_j = \sigma_j \sigma_i$ for $|i - j| > 1$ and 
$\sigma_{i+1} \sigma_i \sigma_{i+1} = \sigma_i \sigma_{i+1} \sigma_i$ for all $i$. The group operation 
corresponds to concatenation of braids. 

It is worth noting that the Fibonacci representation is a special case of the path model 
representation used in [6]. The path model representation applies when $t = e^{2\pi i/k}$ for any 
integer $k$, whereas the Fibonacci representation is for $k = 5$. The relationship between these 
two representations is briefly discussed in section 3.10. However, for the sake of making 
our discussion self-contained, we will derive all of our results directly within the Fibonacci 
representation. 

Given an $n$-strand braid $b \in B_n$, we can write a length $n + 1$ string of p and * symbols 
across the base as shown in figure 3-8. These strings have the restriction that no two * 
symbols can be adjacent. The number of such strings is $f_{n+3}$, where $f_n$ is the $n$th Fibonacci 
number, defined so that $f_1 = 1$, $f_2 = 1$, $f_3 = 2, \ldots$ Thus the formal linear combinations 
of such strings form an $f_{n+3}$-dimensional vector space. For each $n$, the Fibonacci 
representation $\rho_F^{(n)}$ is a homomorphism from $B_n$ to the group of unitary linear transformations 
on this space. We will describe the Fibonacci representation in terms of its action on the 
elementary crossings which generate the braid group, as shown in figure 3-9. 
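
The dimension count can be confirmed by brute force for small $n$. The following is a minimal sketch; the helper names `fib` and `admissible_strings` are ours and purely illustrative.

```python
from itertools import product

def fib(k):
    """k-th Fibonacci number with fib(1) = fib(2) = 1."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a

def admissible_strings(length):
    """All strings of 'p' and '*' of the given length with no two adjacent '*'."""
    candidates = ("".join(t) for t in product("p*", repeat=length))
    return [s for s in candidates if "**" not in s]

# An n-strand braid carries strings of length n + 1; there should be f_{n+3} of them.
counts_match = all(len(admissible_strings(n + 1)) == fib(n + 3) for n in range(1, 10))
```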

The elementary crossings correspond to linear operations which mix only those strings 
which differ by the symbol beneath the crossing. The linear transformations have a local 
structure, so that the coefficients for the symbol beneath the crossing to be changed or 
unchanged depend only on that symbol and its two neighbors. For example, using the 
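notation of [111], 

$$\big[\text{crossing } \sigma_i \text{ over the symbols } p * p\big] = c\,\big[\text{straight strands over } p * p\big] + d\,\big[\text{straight strands over } p\,p\,p\big] \qquad (3.3)$$
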
which means that the elementary crossing $\sigma_i$ corresponds to a linear transformation which 
takes any string whose $i$th through $(i + 2)$th symbols are p*p to the coefficient $c$ times the 
same string plus the coefficient $d$ times the same string with the * at the $(i + 1)$th position 
replaced by p. (As shown in figure 3-9, the $i$th crossing is over the $(i + 1)$th symbol.) To 
compute the linear transformation that the representation of a given braid applies to a 
given string of symbols, one can write the symbols across the base of the braid, and then 
apply rules of the form 3.3 until all the crossings are removed, and all that remains are 
various coefficients for different strings to be written across the base of a set of straight 
strands. 

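For compactness, we will use $(p\hat{*}p) = c(p*p) + d(ppp)$ as a shorthand for equation 3.3, 
where the hat marks the symbol beneath the crossing. In this notation, the complete set of 
rules is as follows. 

$$\begin{aligned}
(*\hat{p}p) &= a(*pp) \\
(*\hat{p}*) &= b(*p*) \\
(p\hat{*}p) &= c(p*p) + d(ppp) \\
(p\hat{p}*) &= a(pp*) \\
(p\hat{p}p) &= d(p*p) + e(ppp),
\end{aligned} \qquad (3.4)$$

where 

$$\begin{aligned}
a &= -A^4 \\
b &= A^8 \\
c &= A^8 \tau^2 - A^4 \tau \\
d &= A^8 \tau^{3/2} + A^4 \tau^{3/2} \\
e &= A^8 \tau - A^4 \tau^2 \\
A &= e^{-3\pi i/5} \\
\tau &= 2/(1 + \sqrt{5}).
\end{aligned} \qquad (3.5)$$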
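
As a sanity check on the rules, the sketch below builds the matrices of the elementary crossings for a 4-strand braid directly from the rule table and the coefficient values above, then tests unitarity and the braid group relations numerically. The dictionary `RULES` and the helpers `basis` and `sigma` are our own illustrative names; the only inputs are the five rules and the values of $a, \ldots, e$, $A$, and $\tau$.

```python
import numpy as np
from itertools import product

A = np.exp(-3j * np.pi / 5)
tau = 2 / (1 + np.sqrt(5))
a, b = -A**4, A**8
c = A**8 * tau**2 - A**4 * tau
d = A**8 * tau**1.5 + A**4 * tau**1.5
e = A**8 * tau - A**4 * tau**2

# RULES[(left, middle, right)] lists (coefficient, new middle symbol) pairs.
RULES = {
    ("*", "p", "p"): [(a, "p")],
    ("*", "p", "*"): [(b, "p")],
    ("p", "*", "p"): [(c, "*"), (d, "p")],
    ("p", "p", "*"): [(a, "p")],
    ("p", "p", "p"): [(d, "*"), (e, "p")],
}

def basis(n):
    """Admissible strings of length n + 1 for an n-strand braid."""
    strings = ("".join(t) for t in product("p*", repeat=n + 1))
    return [s for s in strings if "**" not in s]

def sigma(i, n):
    """Matrix of the i-th elementary crossing (1-indexed), which sits over symbol i + 1."""
    strings = basis(n)
    index = {s: k for k, s in enumerate(strings)}
    mat = np.zeros((len(strings), len(strings)), dtype=complex)
    for s in strings:
        # With 0-indexed characters, s[i] is the (i + 1)-th symbol and s[i-1], s[i+1] its neighbors.
        for coeff, new_mid in RULES[(s[i - 1], s[i], s[i + 1])]:
            mat[index[s[:i] + new_mid + s[i + 1:]], index[s]] += coeff
    return mat

g1, g2, g3 = (sigma(i, 4) for i in (1, 2, 3))
all_unitary = all(np.allclose(g.conj().T @ g, np.eye(len(g))) for g in (g1, g2, g3))
braid_relations_hold = (
    np.allclose(g1 @ g2 @ g1, g2 @ g1 @ g2)
    and np.allclose(g2 @ g3 @ g2, g3 @ g2 @ g3)
    and np.allclose(g1 @ g3, g3 @ g1)
)
```

Because each rule leaves the first and last symbols untouched, the same matrices also make the block structure discussed next easy to inspect.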



Using these rules we can calculate any matrix from the Fibonacci representation of 
the braid group. Notice that this is a reducible representation. These rules do not allow 
the rightmost symbol or leftmost symbol of the string to change. Thus the vector space 
decomposes into four invariant subspaces, namely the subspace spanned by strings which 
begin and end with p, and the *...*, p ... *, and * . . . p subspaces. As an example, we can 
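use the above rules to compute the action of $B_3$ on the *...p subspace. In the basis 
(*p*p, *ppp), 

$$\rho_{*p}^{(3)}(\sigma_1) = \begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}, \qquad \rho_{*p}^{(3)}(\sigma_2) = \begin{bmatrix} c & d \\ d & e \end{bmatrix}. \qquad (3.6)$$
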



In section 3.8 we prove that the Jones polynomial evaluated at $t = e^{2\pi i/5}$ can be obtained 
as a weighted trace of the Fibonacci representation over the *...* and *...p subspaces. 



3.5 Computing the Jones Polynomial in DQC1 

As mentioned previously, the Fibonacci representation acts on the vector space of formal 
linear combinations of strings of p and * symbols in which no two * symbols are adjacent. 
The set of length $n$ strings of this type, $P_n$, has $f_{n+2}$ elements, where $f_n$ is the $n$th 
Fibonacci number: $f_1 = 1$, $f_2 = 1$, $f_3 = 2$, and so on. As shown in section 3.12, one can 
construct a bijective correspondence between these strings and the integers from 0 to 
$f_{n+2} - 1$ as follows. If we think of * as 1 and p as 0, then with a string 
$s_n s_{n-1} \ldots s_1$ we associate the integer 

$$z(s) = \sum_{i=1}^{n} s_i f_{i+1}. \qquad (3.7)$$

This is known as the Zeckendorf representation. 
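
The bijection can be checked by enumeration for small $n$. This is a minimal sketch; the function names are ours, and the string is stored with $s_n$ as its first character and $s_1$ as its last, matching the way it is written above.

```python
from itertools import product

def fib(k):
    """k-th Fibonacci number with fib(1) = fib(2) = 1."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a

def zeckendorf_value(s):
    """z(s) of equation (3.7), reading '*' as 1 and the last character as s_1."""
    n = len(s)
    return sum(fib(i + 1) for i in range(1, n + 1) if s[n - i] == "*")

def admissible_strings(length):
    """All strings of 'p' and '*' of the given length with no two adjacent '*'."""
    candidates = ("".join(t) for t in product("p*", repeat=length))
    return [s for s in candidates if "**" not in s]

# For each n, the values z(s) should be exactly 0, 1, ..., f_{n+2} - 1 with no repeats.
is_bijection = all(
    sorted(zeckendorf_value(s) for s in admissible_strings(n)) == list(range(fib(n + 2)))
    for n in range(1, 12)
)
```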

Representing integers as bitstrings by the usual method of place value, we thus have a 
correspondence between the elements of $P_n$ and the bitstrings of length 
$b = \lceil \log_2(f_{n+2}) \rceil$. 
This correspondence will be a key element in computing the Jones polynomial with a one 
clean qubit machine. Using a one clean qubit machine, one can compute the trace of a 
unitary over the entire Hilbert space of $2^n$ bitstrings. Using CNOT gates as above, one 
can also compute with polynomial overhead the trace over a subspace whose dimension is 
a polynomially large fraction of the dimension of the entire Hilbert space. However, it is 
probably not possible in general for a one clean qubit computer to compute the trace over 
subspaces whose dimension is an exponentially small fraction of the dimension of the total 
Hilbert space. For this reason, directly mapping the strings of p and * symbols to strings 
of 1 and 0 will not work. In contrast, the correspondence described in equation 3.7 maps 
$P_n$ to a subspace whose dimension is at least half the dimension of the full $2^b$-dimensional 
Hilbert space. 
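
The contrast between the two encodings is easy to quantify. The sketch below, with illustrative function names of our own, compares the fraction of the Hilbert space occupied by the encoded strings under the Zeckendorf encoding with the fraction occupied under the direct map from symbols to bits.

```python
def fib(k):
    """k-th Fibonacci number with fib(1) = fib(2) = 1."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a

def zeckendorf_fraction(n):
    """Fraction of b-bit strings that encode an element of P_n, with b = ceil(log2 f_{n+2})."""
    count = fib(n + 2)
    num_bits = (count - 1).bit_length()  # equals ceil(log2(count)) for count >= 1
    return count / 2**num_bits

def direct_fraction(n):
    """Fraction of n-bit strings with no two adjacent 1s (the direct p -> 0, * -> 1 map)."""
    return fib(n + 2) / 2**n

zeckendorf_fractions = [zeckendorf_fraction(n) for n in range(1, 61)]
direct_fractions = [direct_fraction(n) for n in range(1, 61)]
```

The first list never drops below one half, while the second decays exponentially in $n$.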

In outline, the DQC1 algorithm for computing the Jones polynomial works as follows. 
Using the results described in section 3.2, we will think of the quantum circuit as acting on 
$b$ maximally mixed qubits plus $O(1)$ clean qubits. Thinking in terms of the computational 
basis, we can say that the first $b$ qubits are in a uniform probabilistic mixture of the $2^b$ 
classical bitstring states. By equation 3.7, most of these bitstrings correspond to elements 
of $P_n$. In the Fibonacci representation, an elementary crossing on strands $i$ and $i - 1$ 
corresponds to a linear transformation which can only change the value of the $i$th symbol in 
the string of p's and *'s. The coefficients for changing this symbol or leaving it fixed depend 
only on the two neighboring symbols. Thus, to simulate this linear transformation, we will 
use a quantum circuit which extracts the $(i - 1)$th, $i$th, and $(i + 1)$th symbols from their 
bitstring encoding, writes them into an ancilla register while erasing them from the bitstring 
encoding, performs the unitary transformation prescribed by equation 3.4 on the ancillas, 
and then transfers this symbol back into the bitstring encoding while erasing it from the 
ancilla register. Constructing one such circuit for each crossing, multiplying them together, 
and performing DQC1 trace-estimation yields an approximation to the Jones polynomial. 

Performing the linear transformation demanded by equation 3.4 on the ancilla register 
can be done easily by invoking gate set universality (cf. Solovay-Kitaev theorem [137]) since 
it is just a three-qubit unitary operation. The harder steps are transferring the symbol values 
from the bitstring encoding to the ancilla register and back. 

It may be difficult to extract an arbitrary symbol from the bitstring encoding. However, 
it is relatively easy to extract the leftmost "most significant" symbol, which determines 
whether the Fibonacci number $f_{n+1}$ is present in the sum shown in equation 3.7. This is 
because, for a string $s$ of length $n$, $z(s) \geq f_{n+1}$ if and only if the leftmost symbol is *: 
the remaining $n - 1$ symbols contribute at most $f_{n+1} - 1$. Thus, 
starting with a clean $|0\rangle$ ancilla qubit, one can transfer the value of the leftmost symbol 
into the ancilla as follows. First, check whether $z(s)$ (as represented by a bitstring using 
place value) is $\geq f_{n+1}$. If so, flip the ancilla qubit. Then, conditioned on the value of the 
ancilla qubit, subtract $f_{n+1}$ from the bitstring. (The subtraction will be done modulo $2^b$ 
for reversibility.) 
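
The extraction step can be sketched classically as a map on basis states. The code below is only an illustration on integers, not a reversible circuit; the function names and the parameter `num_bits` (standing for $b$) are ours. It uses the threshold $f_{n+1}$, the weight that equation 3.7 assigns to the leftmost symbol $s_n$.

```python
from itertools import product

def fib(k):
    """k-th Fibonacci number with fib(1) = fib(2) = 1."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a

def zeckendorf_value(s):
    """z(s) of equation (3.7), with '*' read as 1 and the leftmost character as s_n."""
    n = len(s)
    return sum(fib(i + 1) for i in range(1, n + 1) if s[n - i] == "*")

def extract_leftmost(z, n, num_bits):
    """Return (z', ancilla): ancilla is 1 iff the leftmost symbol is '*'; z' encodes the rest."""
    threshold = fib(n + 1)
    ancilla = 1 if z >= threshold else 0               # comparison, then flip the ancilla
    z_rest = (z - ancilla * threshold) % 2**num_bits   # controlled subtraction modulo 2^b
    return z_rest, ancilla

n = 6
num_bits = (fib(n + 2) - 1).bit_length()
strings = ["".join(t) for t in product("p*", repeat=n) if "**" not in "".join(t)]
extraction_ok = all(
    extract_leftmost(zeckendorf_value(s), n, num_bits)
    == (zeckendorf_value(s[1:]), int(s[0] == "*"))
    for s in strings
)
```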

Any classical circuit can be made reversible with only constant overhead. It thus 
corresponds to a unitary matrix which permutes the computational basis. This is the standard 
way of implementing classical algorithms on a quantum computer [137]. However, the 
resulting reversible circuit may require clean ancilla qubits as work space in order to be 
implemented efficiently. For a reversible circuit to be implementable on a one clean qubit 
computer, it must be efficiently implementable using at most logarithmically many clean 
ancillas. Fortunately, the basic operations of arithmetic and comparison for integers can all 
be done classically by NC1 circuits [171]. NC1 is the complexity class for problems solvable 
by classical circuits of logarithmic depth. As shown in [14], any classical NC1 circuit can be 
converted into a reversible circuit using only three clean ancillas. This is a consequence of 
Barrington's theorem. Thus, the process described above for extracting the leftmost symbol 
can be done efficiently in DQC1. 

More specifically, Krapchenko's algorithm for adding two $n$-bit numbers has depth 
$\lceil \log n \rceil + O(\sqrt{\log n})$ [171]. A lower bound of depth $\log n$ is also known, so this is 
essentially optimal [171]. Barrington's construction [20] yields a sequence of $2^{2d}$ gates on 3 clean 
ancilla qubits [14] to simulate a circuit of depth $d$. Thus we obtain an addition circuit which 
has quadratic size (up to a subpolynomial factor). Subtraction can be obtained analogously, 
and one can determine whether $a > b$ by subtracting $a$ from $b$ and looking at 
whether the result is negative. 

Although the construction based on Barrington's theorem has polynomial overhead and 
is thus sufficient for our purposes, it seems worth noting that it is possible to achieve better 
efficiency. As shown by Draper [60], there exist ancilla-free quantum circuits for performing 
addition and subtraction, which succeed with high probability and have nearly linear size. 
Specifically, one can add or subtract a hardcoded number $a$ into an $n$-qubit register $|x\rangle$ 
modulo $2^n$ by performing a quantum Fourier transform, followed by $O(n^2)$ controlled-rotations, 
followed by an inverse quantum Fourier transform. Furthermore, using approximate 
quantum Fourier transforms [51, 19], [60] describes an approximate version of the circuit, which, 
for any value of parameter $m$, uses a total of only $O(mn \log n)$ gates to produce an output 
having an inner product with $|x + a \bmod 2^n\rangle$ of $1 - O(2^{-m})$. 
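
The exact version of this construction can be illustrated at the level of matrices. The sketch below multiplies out the Fourier transform, a diagonal layer of phases, and the inverse transform, and checks that the product is the permutation $|x\rangle \mapsto |x + a \bmod 2^n\rangle$. It is a dense-matrix simulation for small $n$ only; it does not decompose the phase layer into individual rotation gates, and the function names are ours.

```python
import numpy as np

def qft_matrix(n):
    """Unitary matrix of the quantum Fourier transform on n qubits."""
    dim = 2**n
    k = np.arange(dim)
    return np.exp(2j * np.pi * np.outer(k, k) / dim) / np.sqrt(dim)

def add_constant_unitary(a, n):
    """Adder for a hardcoded constant a modulo 2**n: inverse QFT, phases, QFT."""
    dim = 2**n
    qft = qft_matrix(n)
    # In the Fourier basis, adding a multiplies basis state k by exp(2 pi i a k / 2^n).
    phases = np.diag(np.exp(2j * np.pi * a * np.arange(dim) / dim))
    return qft.conj().T @ phases @ qft

n, a = 4, 5
adder = add_constant_unitary(a, n)
expected = np.zeros((2**n, 2**n))
for x in range(2**n):
    expected[(x + a) % 2**n, x] = 1.0
adder_is_exact = np.allclose(adder, expected)
```

Subtraction corresponds to replacing $a$ by $-a$ in the phases.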

Because they operate modulo $2^n$, Draper's quantum circuits for addition and subtraction 
do not immediately yield fast ancilla-free quantum circuits for comparison, unlike the 
classical case. Instead, start with an $n$-bit number $x$ and then introduce a single clean 
ancilla qubit initialized to $|0\rangle$. Then subtract an $n$-bit hardcoded number $a$ from this register 
modulo $2^{n+1}$. If $a > x$ then the result will wrap around into the range $[2^n, 2^{n+1} - 1]$, in which 
case the leading bit will be 1. If $a \leq x$ then the result will be in the range $[0, 2^n - 1]$. After 
copying the result of this leading qubit and uncomputing the subtraction, the comparison is 
complete. Alternatively, one could use the linear size quantum comparison circuit devised 
by Takahashi and Kunihiro, which uses $n$ uninitialized ancillas but no clean ancillas [163]. 
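
The wraparound argument can be checked exhaustively on integers. The following sketch mirrors the procedure classically; the function name is ours, and the copy and uncompute steps of the quantum circuit have no classical counterpart here.

```python
def constant_exceeds_register(a, x, n):
    """Decide whether a > x for n-bit a and x using one extra bit and subtraction mod 2**(n + 1)."""
    assert 0 <= a < 2**n and 0 <= x < 2**n
    diff = (x - a) % 2**(n + 1)  # the (n + 1)-bit register after subtracting a
    leading_bit = diff >> n      # 1 exactly when the subtraction wrapped around
    return leading_bit == 1

n = 4
comparison_ok = all(
    constant_exceeds_register(a, x, n) == (a > x)
    for a in range(2**n)
    for x in range(2**n)
)
```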

Unfortunately, most crossings in a given braid will not be acting on the leftmost strand. 
However, we can reduce the problem of extracting a general symbol to the problem of 
extracting the leftmost symbol. Rather than using equation 3.7 to make a correspondence 
between a string from $P_n$ and a single integer, we can split the string at some chosen point, 
and use equation 3.7 on each piece to make a correspondence between elements of $P_n$ and 
ordered pairs of integers, as shown in figure 3-10. To extract the $i$th symbol, we thus convert 
encoding 3.7 to the encoding where the string is split between the $i$th and $(i - 1)$th symbols, 
so that one only needs to extract the leftmost symbol of the second string. Like equation 
3.7, this is also an efficient encoding, in which the encoded bitstrings form a large fraction 
of all possible bitstrings. 

Footnote: A linear-size quantum circuit for exact ancilla-free addition is known, but it does 
not generalize easily to the case of hardcoded summands [55]. 

Figure 3-10: Here we make a correspondence between strings of p and * symbols and ordered 
pairs of integers. The string of 9 symbols is split into substrings of length 4 and 5, and each 
one is used to compute an integer by adding the $(i + 1)$th Fibonacci number if * appears in 
the $i$th place. Note the two strings are read in different directions. 

To convert encoding 3.7 to a split encoding with the split at an arbitrary point, we can 
move the split rightward by one symbol at a time. To introduce a split between the leftmost 
and second-to-leftmost symbols, one must extract the leftmost symbol as described above. 
To move the split one symbol to the right, one must extract the leftmost symbol from the 
right string, and if it is * then add the corresponding Fibonacci number to the left string. 
This is again a procedure of addition, subtraction, and comparison of integers. Note that the 
computation of Fibonacci numbers in NC1 is not necessary, as these can be hardcoded into 
the circuits. Moving the split back to the left works analogously. As crossings of different 
pairs of strands are being simulated, the split is moved to the place that it is needed. At the 
end it is moved all the way leftward and eliminated, leaving a superposition of bitstrings 
in the original encoding, which have the correct coefficients determined by the Fibonacci 
representation of the given braid. 
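
The bookkeeping for moving the split can be sketched on integers as follows. We assume a particular reading convention for the two substrings: the right substring is read as in equation 3.7, with its leftmost symbol most significant, and the left substring is read in the opposite direction, so that its rightmost symbol is most significant. This choice is our assumption, made so that a symbol crossing the split is always a most significant symbol on both sides; the function names are illustrative.

```python
from itertools import product

def fib(k):
    """k-th Fibonacci number with fib(1) = fib(2) = 1."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a

def zeckendorf_value(s):
    """z(s) of equation (3.7), with '*' read as 1 and the leftmost character most significant."""
    n = len(s)
    return sum(fib(i + 1) for i in range(1, n + 1) if s[n - i] == "*")

def split_encode(s, k):
    """Pair of integers for s split after its first k symbols (left part read right to left)."""
    return zeckendorf_value(s[:k][::-1]), zeckendorf_value(s[k:])

def move_split_right(left, right, k, n):
    """Move the split from position k to k + 1 by comparison, subtraction, and addition."""
    threshold = fib(n - k + 1)  # weight of the right string's leftmost symbol
    symbol = 1 if right >= threshold else 0
    return left + symbol * fib(k + 2), right - symbol * threshold

def move_split_left(left, right, k, n):
    """Inverse of move_split_right: move the split from position k to k - 1."""
    threshold = fib(k + 1)      # weight of the left string's rightmost symbol
    symbol = 1 if left >= threshold else 0
    return left - symbol * threshold, right + symbol * fib(n - k + 2)

n = 7
strings = ["".join(t) for t in product("p*", repeat=n) if "**" not in "".join(t)]
moves_ok = all(
    move_split_right(*split_encode(s, k), k, n) == split_encode(s, k + 1)
    and move_split_left(*split_encode(s, k + 1), k + 1, n) == split_encode(s, k)
    for s in strings
    for k in range(n)
)
```

With the split at position 0 the pair reduces to $(0, z(s))$, the original encoding.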

Lastly, we must consider the weighting in the trace, as described by equation 3.10. 
Instead of weight $W_s$, we will use $W_s/\phi$ so that the possible weights are 1 and $1/\phi$, both of 
which are $\leq 1$. We can impose any weight $\leq 1$ by doing a controlled rotation on an extra 
qubit. The CNOT trick for simulating a clean qubit which was described in section 3.2 can 
be viewed as a special case of this. All strings in which that qubit takes the value $|1\rangle$ have 
weight zero, as imposed by a $\pi/2$ rotation on the extra qubit. Because none of the weights 
are smaller than $1/\phi$, the weighting will cause only a constant overhead in the number of 
measurements needed to get a given precision. 
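
The effect of the controlled rotation on the trace can be seen in a small matrix example. In the sketch below, a real rotation by angle $\theta_s = \arccos(w_s)$ is applied to an extra qubit conditioned on the basis state $s$, and the normalized trace over the enlarged space reproduces the weighted trace $\sum_s w_s U_{ss}$. The random unitary and the particular assignment of weights to basis states are placeholders of our own, chosen only for illustration.

```python
import numpy as np

def rotation(theta):
    """Real 2 x 2 rotation matrix; its normalized trace is cos(theta)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])

def weighted_trace_via_extra_qubit(u, weights):
    """Tr[(U tensor I) V] / 2, where V rotates the extra qubit by arccos(w_s) on basis state s."""
    dim = u.shape[0]
    v = np.zeros((2 * dim, 2 * dim), dtype=complex)
    for s, w in enumerate(weights):
        v[2 * s:2 * s + 2, 2 * s:2 * s + 2] = rotation(np.arccos(w))
    return np.trace(np.kron(u, np.eye(2)) @ v) / 2

rng = np.random.default_rng(1)
z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
u, _ = np.linalg.qr(z)                       # placeholder unitary
phi = (1 + np.sqrt(5)) / 2
weights = np.array([1.0, 1 / phi, 1.0, 0.0])  # weight 0 corresponds to a pi/2 rotation

via_rotation = weighted_trace_via_extra_qubit(u, weights)
direct = np.sum(weights * np.diag(u))
```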

3.6 DQC1-hardness of Jones Polynomials 

We will prove DQC1-hardness of the problem of estimating the Jones polynomial of the trace 
closure of a braid by a reduction from the problem of estimating the trace of a quantum 
circuit. To do this, we will specify an encoding, that is, a map $\eta : Q_n \to S_m$ from the set 
$Q_n$ of strings of p and * symbols which start with * and have no two * symbols in a row, to 
$S_m$, the set of bitstrings of length $m$. For a given quantum circuit, we will construct a braid 
whose Fibonacci representation implements the corresponding unitary transformation on 
the encoded bits. The Jones polynomial of the trace closure of this braid, which is the trace 
of this representation, will equal the trace of the encoded quantum circuit. 

Unlike in section 3.5, we will not use a one-to-one encoding between bitstrings and 
strings of p and * symbols. All we require is that a sum over all strings of p and * symbols 
corresponds to a sum over bitstrings in which each bitstring appears an equal number of 
times. Equivalently, all bitstrings $b \in S_m$ must have a preimage $\eta^{-1}(b)$ of the same size. 
This ensures an unbiased trace in which no bitstrings are overweighted. To achieve this we 
can divide the symbol string into blocks of three symbols and use the encoding 



The strings other than ppp and p*p do not correspond to any bit value. Since both 
the encoded 1 and the encoded 0 begin and end with p, they can be preceded and followed 
by any allowable string. Thus, changing an encoded 1 to an encoded 0 does not change 
the number of allowed strings of p and * consistent with that encoded bitstring. Thus the 
condition that $|\eta^{-1}(b)|$ be independent of $b$ is satisfied. 

We would also like to know a priori where in the string of p and * symbols a given bit 
is encoded. This way, when we need to simulate a gate acting on a given bit, we would 
know which strands the corresponding braid should act on. If we were to simply divide our 
string of symbols into blocks of three and write down the corresponding bit string (skipping 
every block which is not in one of the two coding states ppp and p*p) then this would 
not be the case. Thus, to encode n bits, we will instead divide the string of symbols into 
$n$ superblocks, each consisting of $c \log n$ blocks of three for some constant $c$. To decode a 
superblock, scan it from left to right until you reach either a ppp block or a p*p block. 
The first such block encountered determines whether the superblock encodes a 1 or a 0, 
according to equation 3.8. Now imagine we choose a string randomly from $Q_{3cn\log n}$. By 
choosing the constant prefactor $c$ in our superblock size we can ensure that in the entire 
string of $3cn\log n$ symbols, the probability of there being any noncoding superblock which 
contains neither a ppp block nor a p*p block is polynomially small. If this is the case, 
then these noncoding strings will contribute only a polynomially small additive error to the 
estimate of the circuit trace, on par with the other sources of error. 
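
The decoding rule and the equal-preimage property can be checked by enumeration on a toy instance. In the sketch below the assignment of bit values to the two coding blocks (`ppp` to 0 and `p*p` to 1) is an assumption made for illustration, as are the function names and the tiny superblock size; the equality of the preimage sizes does not depend on which block is called 0.

```python
from itertools import product
from collections import Counter

CODE = {"ppp": "0", "p*p": "1"}  # assumed assignment of bit values to the coding blocks

def decode_superblock(superblock):
    """Bit of the first coding block found scanning left to right, or None if there is none."""
    for start in range(0, len(superblock), 3):
        block = superblock[start:start + 3]
        if block in CODE:
            return CODE[block]
    return None

def decode(string, blocks_per_superblock):
    """Decode every superblock; return None if any superblock is noncoding."""
    size = 3 * blocks_per_superblock
    bits = [decode_superblock(string[k:k + size]) for k in range(0, len(string), size)]
    return None if None in bits else "".join(bits)

def preimage_sizes(n_bits, blocks_per_superblock):
    """Count, for each decoded bitstring, the admissible strings starting with '*' that map to it."""
    length = 3 * blocks_per_superblock * n_bits
    counts = Counter()
    for symbols in product("p*", repeat=length - 1):
        s = "*" + "".join(symbols)
        if "**" not in s:
            counts[decode(s, blocks_per_superblock)] += 1
    return counts

counts = preimage_sizes(n_bits=2, blocks_per_superblock=2)
coding_counts = {bits: counts[bits] for bits in ("00", "01", "10", "11")}
noncoding_fraction = counts[None] / sum(counts.values())
```

With superblocks this small the noncoding fraction is not negligible; the point of taking $c \log n$ blocks per superblock is to drive it down to a polynomially small value.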

The gate set consisting of the CNOT, Hadamard, and $\pi/8$ gates is known to be universal 
for BQP [137]. Thus, it suffices to consider the simulation of 1-qubit and 2-qubit gates. 
Furthermore, it is sufficient to imagine the qubits arranged on a line and to allow 2-qubit 
gates to act only on neighboring qubits. This is because qubits can always be brought 
into neighboring positions by applying a series of SWAP gates to nearest neighbors. By 
our encoding a unitary gate applied to qubits i and i + 1 will correspond to a unitary 
transformation on symbols $i \cdot 3c \log n$ through $(i + 2) \cdot 3c \log n - 1$. The essence of our reduction 
is to take each quantum gate and represent it by a corresponding braid on logarithmically 
many symbols whose Fibonacci representation performs that gate on the encoded qubits. 

Let's first consider the problem of simulating a gate on the first pair of qubits, which are 
encoded in the leftmost two superblocks of the symbol string. We'll subsequently consider 
the more difficult case of operating on an arbitrary pair of neighboring encoded qubits. As 
mentioned in section 3.4, the Fibonacci representation is reducible. Let $\rho_{**}^{(n)}$ denote 
the representation of the braid group $B_n$ defined by the action of $\rho_F^{(n)}$ on the vector space 
spanned by strings which begin and end with *. As shown in section 3.9, $\rho_{**}^{(n)}(B_n)$ taken 
modulo phase is a dense subgroup of $SU(f_{n-1})$, and $\rho_{*p}^{(n)}(B_n)$ modulo phase is a dense 
subgroup of $SU(f_n)$. 

In addition to being dense, the ** and *p blocks of the Fibonacci representation can 
be controlled independently. This is a consequence of the decoupling lemma, as discussed 
in section 3.9. Thus, given a string of symbols beginning with *, and any desired pair of 
unitaries on the corresponding *p and ** vector spaces, a braid can be constructed whose 
Fibonacci representation approximates these unitaries to any desired level of precision. 
However, the number of crossings necessary may in general be large. The space spanned 
by strings of logarithmically many symbols has only polynomial dimension. Thus, one 
might guess that the braid needed to approximate a given pair of unitaries on the *p and ** 
vector spaces for logarithmically many symbols will have only polynomially many crossings. 
It turns out that this guess is correct, as we state formally below. 

Proposition 1. Given any pair of elements $U_{*p} \in SU(f_{k+1})$ and $U_{**} \in SU(f_k)$, and 
any real parameter $\epsilon$, one can in polynomial time find a braid $b \in B_{k+1}$ with 
poly$(n, \log(1/\epsilon))$ crossings whose Fibonacci representation satisfies 
$\|\rho_{*p}(b) - U_{*p}\| \leq \epsilon$ and $\|\rho_{**}(b) - U_{**}\| \leq \epsilon$, 
provided that $k = O(\log n)$. By symmetry, the same holds when considering $\rho_{p*}$ rather than 
$\rho_{*p}$. 

Note that proposition 1 is a property of the Fibonacci representation, not a generic 
consequence of density, since it is in principle possible for the images of group generators 
in a dense representation to lie exponentially close to some subgroup of the corresponding 
unitary group. We prove this proposition in section 3.11. 

With proposition 1 in hand, it is apparent that any unitary gate on the first two encoded 
bits can be efficiently performed. To similarly simulate gates on arbitrary pairs of neigh- 
boring encoded qubits, we will need some way to unitarily bring a * symbol to a known 
location within logarithmic distance of the relevant encoded qubits. This way, we ensure 
that we are acting in the *p or ** subspaces. 

To move * symbols to known locations we'll use an "inchworm" structure which brings 
a pair of * symbols rightward to where they are needed. Specifically, suppose we have a 
pair of superblocks which each have a * in their exact center. The presence of the left * and 
the density of $\rho_{*p}$ allow us to use proposition 1 to unitarily move the right * one superblock 
to the right by adding polynomially many crossings to the braid. Then, the presence of 
the right * and the density of $\rho_{p*}$ allow us to similarly move the left * one superblock to 
the right, thus bringing it into the superblock adjacent to the one which contains the right 
*. This is illustrated in figure 3-11. To move the inchworm to the left we use the inverse 
operation. 

To simulate a given gate, one first uses the previously described procedure to make the 
inchworm crawl to the superblocks just to the left of the superblocks which encode the 
qubits on which the gate acts. Then, by the density of $\rho_{*p}$ and proposition 1, the desired 
gate can be simulated using polynomially many braid crossings. 

To get this process started, the leftmost two superblocks must each contain a * at their 
center. This occurs with constant probability. The strings in which this is not the case 
can be prevented from contributing to the trace by a technique analogous to that used in 
section 3.2 to simulate logarithmically many clean ancillas. Namely, an extra encoded qubit 
can be conditionally flipped if the first two superblocks do not both have * symbols at their 
center. This can always be done using proposition 1, since the leftmost symbol in the string 
is always *, and the $\rho_{*p}$ and $\rho_{**}$ representations are both dense. 

It remains to specify the exact unitary operations which move the inchworm. Suppose 
we have a current superblock and a target superblock. The current superblock contains a * 
in its center, and the target superblock is the next superblock to the right or left. We wish 
to move the * to the center of the target superblock. To do this, we can select the smallest 
segment around the center such that in each of these superblocks, the segment is bordered 
on its left and right by p symbols. This segment can then be swapped, as shown in figure 
3-12. 

Figure 3-11: This sequence of unitary steps is used to bring a * symbol where it is needed 
in the symbol string to ensure density of the braid group representation. The presence of 
the left * ensures density to allow the movement of the right * by proposition 1. Similarly, 
the presence of the right * allows the left * to be moved. 

Figure 3-12: This unitary procedure starts with a * in the current superblock and brings it 
to the center of the target superblock. 

For some possible strings this procedure will not be well defined. Specifically there may 
not be any segment which contains the center and which is bordered by p symbols in both 
superblocks. On such strings we define the operation to act as the identity. For random 
strings, the probability of this decreases exponentially with the superblock size. Thus, by 
choosing c sufficiently large we can make this negligible for the entire computation. 

As the inchworm moves rightward, it leaves behind a trail. Due to the swapping, the 
superblocks are not in their original state after the inchworm has passed. However, because 
the operations are unitary, when the inchworm moves back to the left, the modifications 
to the superblocks get undone. Thus the inchworm can shuttle back and forth, moving 
where it is needed to simulate each gate, always stopping just to the left of the superblocks 
corresponding to the encoded qubits. 

The only remaining detail to consider is that the trace appearing in the Jones polynomial 
is weighted depending on whether the last symbol is p or *, whereas the DQC1-complete 
trace estimation problem is for completely unweighted traces. This problem is easily solved. 
Just introduce a single extra superblock at the end of the string. After bringing the 
inchworm adjacent to the last superblock, apply a unitary which performs a conditional rotation 
on the qubit encoded by this superblock. The rotation will be by an angle so that the inner 
product of the rotated qubit with its original state is $1/\phi$, where $\phi$ is the golden ratio. This 
will be done only if the last symbol is p. This exactly cancels out the weighting which 
appears in the formula for the Jones polynomial, as described in section 3.8. 

Thus, for appropriate $\epsilon$, approximating the Jones polynomial of the trace closure of a 
braid to within $\pm\epsilon$ is DQC1-hard. 

3.7 Conclusion 

The preceding sections show that the problem of approximating the Jones polynomial of 
the trace closure of a braid with $n$ strands and $m$ crossings to within $\pm\epsilon$ at $t = e^{2\pi i/5}$ 
is a DQC1-complete problem for appropriate $\epsilon$. The proofs are based on the problem of 
evaluating the Markov trace of the Fibonacci representation of a braid to 
$\pm\frac{1}{\mathrm{poly}(n,m)}$ precision. 
By equation 3.11, we see that this corresponds to evaluating the Jones polynomial with 
$\pm\frac{|D^{n-1}|}{\mathrm{poly}(n,m)}$ precision, where $D = -A^2 - A^{-2} = -2\cos(6\pi/5)$. Whereas approximating the 
Jones polynomial of the plat closure of a braid was known [4] to be BQP-complete, it was 
previously only known that the problem of approximating the Jones polynomial of the trace 
closure of a braid was in BQP. Understanding the complexity of approximating the Jones 



Such a completeness result improves our understanding of both the difficulty of the Jones 
polynomial problem and the power of one clean qubit computers by finding an equivalence 
between the two. 

It is generally believed that DQC1 is not contained in P and does not contain all of BQP. 
The DQC1-completeness result shows that if this belief is true, it implies that approximating 
the Jones polynomial of the trace closure of a braid is not so easy that it can be done 
classically in polynomial time, but is not so difficult as to be BQP-hard. 

To our knowledge, the problem of approximating the Jones polynomial of the trace 
closure of a braid is one of only four known candidates for classically intractable problems 
solvable on a one clean qubit computer. The others are estimating the Pauli decomposition 
of the unitary matrix corresponding to a polynomial-size quantum circuit [118, 158], 
estimating quadratically signed weight enumerators [119], and estimating average fidelity decay 
of quantum maps |l3H 153]. 



3.8 Jones Polynomials by Fibonacci Representation 

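For any braid $b \in B_n$ we will define $\widetilde{\mathrm{Tr}}(b)$ by:

$$\widetilde{\mathrm{Tr}}(b) = \frac{1}{\phi f_n + f_{n-1}} \sum_{s \in Q_{n+1}} W_s \, \langle s | b | s \rangle$$

$Q_{n+1}$ is the set of all strings of n + 1 p and * symbols which start with * and contain
no two * symbols in a row. The symbol $\langle s | b | s \rangle$, drawn pictorially as the braid b with the
string s written along both its top and its bottom, denotes the s, s matrix element of the
Fibonacci representation of braid b. In these pictures one symbol denotes a single strand
and another denotes a bundle of multiple strands of a braid (in this case n). The weight $W_s$
is

$$W_s = \begin{cases} \phi & \text{if } s \text{ ends with p} \\ 1 & \text{if } s \text{ ends with } * \end{cases}$$

$\phi$ is the golden ratio $(1 + \sqrt{5})/2$.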
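As a quick sanity check of the normalization in this definition, the following minimal Python sketch enumerates $Q_{n+1}$ by brute force and confirms that the weights $W_s$ sum to $\phi f_n + f_{n-1}$, so that the identity braid has weighted trace 1. The function names (`fib`, `allowed_strings`, `normalized_weight_sum`) are illustrative choices, not notation from the text.

```python
from itertools import product

PHI = (1 + 5 ** 0.5) / 2  # golden ratio


def fib(k):
    """k-th Fibonacci number with f_0 = 0, f_1 = f_2 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def allowed_strings(length):
    """Strings over {'*', 'p'} that start with '*' and never contain '**'."""
    out = []
    for tail in product("*p", repeat=length - 1):
        s = "*" + "".join(tail)
        if "**" not in s:
            out.append(s)
    return out


def weight(s):
    """W_s: phi if the string ends with 'p', 1 if it ends with '*'."""
    return PHI if s.endswith("p") else 1.0


def normalized_weight_sum(n):
    """Weighted trace of the identity braid on n strands (all diagonal elements are 1)."""
    total = sum(weight(s) for s in allowed_strings(n + 1))
    return total / (PHI * fib(n) + fib(n - 1))
```

For every n tried, the strings ending in * number $f_{n-1}$ and those ending in p number $f_n$, which is exactly what makes the normalization work.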

As discussed in [6], the Jones polynomial of the trace closure of a braid b is given by

$$V_{b^{\mathrm{tr}}}(A^{-4}) = (-A)^{3 w(b^{\mathrm{tr}})} D^{n-1} \, \mathrm{Tr}(\rho_A(b^{\mathrm{tr}})). \qquad (3.11)$$

$b^{\mathrm{tr}}$ is the link obtained by taking the trace closure of braid b. $w(b^{\mathrm{tr}})$ denotes the writhe of
the link $b^{\mathrm{tr}}$. For an oriented link, one assigns a value of +1 or −1 to each crossing
according to its handedness. The writhe of a link is defined to be the
sum of these values over all crossings. D is defined by $D = -A^2 - A^{-2}$. $\rho_A : B_n \to \mathrm{TL}_n(D)$
is a representation from the braid group to the Temperley-Lieb algebra with parameter D.
Specifically,

$$\rho_A(\sigma_i) = A E_i + A^{-1} \mathbb{1} \qquad (3.12)$$

where $E_1, \ldots, E_n$ are the generators of $\mathrm{TL}_n(D)$, which satisfy the following relations.

$$E_i E_j = E_j E_i \quad \text{for } |i - j| > 1 \qquad (3.13)$$
$$E_i E_{i \pm 1} E_i = E_i \qquad (3.14)$$
$$E_i^2 = D E_i \qquad (3.15)$$



This includes estimating the trace of the unitary as a special case. 
The Markov trace on $\mathrm{TL}_n(D)$ is a linear map $\mathrm{Tr} : \mathrm{TL}_n(D) \to \mathbb{C}$ which satisfies

$$\mathrm{Tr}(\mathbb{1}) = 1 \qquad (3.16)$$
$$\mathrm{Tr}(XY) = \mathrm{Tr}(YX) \qquad (3.17)$$
$$\mathrm{Tr}(X E_{n-1}) = \frac{1}{D} \mathrm{Tr}(X') \qquad (3.18)$$

On the left hand side of equation 3.18 the trace is on $\mathrm{TL}_n(D)$, and X is an element of
$\mathrm{TL}_n(D)$ not containing $E_{n-1}$. On the right hand side of equation 3.18 the trace is on
$\mathrm{TL}_{n-1}(D)$, and X' is the element of $\mathrm{TL}_{n-1}(D)$ which corresponds to X in the obvious way
since X does not contain $E_{n-1}$.



We'll show that the Fibonacci representation satisfies the properties implied by equations
3.12, 3.13, 3.14, and 3.15. We'll also show that $\widetilde{\mathrm{Tr}}$ on the Fibonacci representation satisfies
the properties corresponding to 3.16, 3.17, and 3.18. It was shown in [6] that properties
3.16, 3.17, and 3.18, along with linearity, uniquely determine the map Tr. It will thus follow
that $\widetilde{\mathrm{Tr}}(\rho_F(b)) = \mathrm{Tr}(\rho_A(b))$, which proves that the Jones polynomial is obtained from the
trace $\widetilde{\mathrm{Tr}}$ of the Fibonacci representation after multiplying by the appropriate powers of D
and (−A) as shown in equation 3.11. Since these powers are trivial to compute, the problem
of approximating the Jones polynomial at $A = e^{-i3\pi/5}$ reduces to the problem of computing
this trace.

$\widetilde{\mathrm{Tr}}$ is equal to the ordinary matrix trace on the subspace of strings ending in * plus $\phi$
times the matrix trace on the subspace of strings ending in p. Thus the fact that the matrix
trace satisfies property 3.17 immediately implies that $\widetilde{\mathrm{Tr}}$ does too. Furthermore, since the
dimensions of these subspaces are $f_{n-1}$ and $f_n$ respectively, we see from equation 3.9 that
$\widetilde{\mathrm{Tr}}(\mathbb{1}) = 1$. To address property 3.18 we'll first show that



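$$\widetilde{\mathrm{Tr}}(b\,\sigma_{n-1}) = \frac{1}{\delta} \widetilde{\mathrm{Tr}}(b) \qquad (3.19)$$

for some constant $\delta$ which we will calculate. Here, on the left hand side, b is a braid on the
first n − 1 of n strands and $b\,\sigma_{n-1}$ is b followed by a crossing of the last two strands. We will
then use equation 3.12 to relate $\delta$ to D.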



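Using the definition of $\widetilde{\mathrm{Tr}}$ we obtain

$$\widetilde{\mathrm{Tr}}(b\,\sigma_{n-1}) = \frac{1}{f_n \phi + f_{n-1}} \Bigg[ \sum_{s \in Q_{n-2}} \Big( \langle s\mathrm{pp}{*} | b\sigma_{n-1} | s\mathrm{pp}{*} \rangle + \phi \langle s\mathrm{p}{*}\mathrm{p} | b\sigma_{n-1} | s\mathrm{p}{*}\mathrm{p} \rangle + \phi \langle s\mathrm{ppp} | b\sigma_{n-1} | s\mathrm{ppp} \rangle \Big)$$
$$\qquad + \sum_{s \in Q'_{n-2}} \Big( \langle s{*}\mathrm{p}{*} | b\sigma_{n-1} | s{*}\mathrm{p}{*} \rangle + \phi \langle s{*}\mathrm{pp} | b\sigma_{n-1} | s{*}\mathrm{pp} \rangle \Big) \Bigg]$$

where $\langle u | x | u \rangle$ denotes the u, u matrix element of the Fibonacci representation of the braid x,
$b\,\sigma_{n-1}$ denotes a braid b on the first n − 1 strands followed by a crossing of the last two
strands, and $Q'_{n-2}$ is the set of length n − 2 strings of * and p symbols which begin with *, end
with p, and have no two * symbols in a row.

Next we expand according to the braiding rules described in equation 3.3 and the equations following it:

$$\frac{1}{f_n \phi + f_{n-1}} \Bigg[ \sum_{s \in Q_{n-2}} \Big( \phi c \, \langle s\mathrm{p}{*} | b | s\mathrm{p}{*} \rangle + \phi e \, \langle s\mathrm{pp} | b | s\mathrm{pp} \rangle + a \, \langle s\mathrm{pp} | b | s\mathrm{pp} \rangle \Big) + \sum_{s \in Q'_{n-2}} \Big( b \, \langle s{*}\mathrm{p} | b | s{*}\mathrm{p} \rangle + \phi a \, \langle s{*}\mathrm{p} | b | s{*}\mathrm{p} \rangle \Big) \Bigg]$$

Here the matrix elements are those of b regarded as a braid on n − 1 strands, and the coefficients
a, b, c, e are the constants of equation 3.5 (the coefficient b should not be confused with the braid b).
We know that matrix elements in which differing string symbols are separated by unbraided
strands will be zero. To obtain the preceding expression we have omitted such terms.
Simplifying yields

$$\frac{1}{f_n \phi + f_{n-1}} \Bigg[ \sum_{s \in Q_{n-2}} \Big( \phi c \, \langle s\mathrm{p}{*} | b | s\mathrm{p}{*} \rangle + (\phi e + a) \langle s\mathrm{pp} | b | s\mathrm{pp} \rangle \Big) + \sum_{s \in Q'_{n-2}} (b + \phi a) \langle s{*}\mathrm{p} | b | s{*}\mathrm{p} \rangle \Bigg]$$

By the definitions of A, a, b, and e, given in equation 3.5 we see that $\phi e + a = b + \phi a$.
Thus the above expression simplifies to

$$\frac{1}{f_n \phi + f_{n-1}} \Bigg[ \phi c \sum_{s \in Q'_{n-1}} \langle s{*} | b | s{*} \rangle + (\phi e + a) \sum_{s \in Q_{n-1}} \langle s\mathrm{p} | b | s\mathrm{p} \rangle \Bigg]$$

Now we just need to show that

$$\phi c \, \frac{1}{f_n \phi + f_{n-1}} = \frac{1}{\delta} \, \frac{1}{f_{n-1} \phi + f_{n-2}} \qquad (3.20)$$

and

$$(\phi e + a) \frac{1}{f_n \phi + f_{n-1}} = \frac{1}{\delta} \, \frac{\phi}{f_{n-1} \phi + f_{n-2}}. \qquad (3.21)$$

The Fibonacci numbers have the property

$$\frac{f_n \phi + f_{n-1}}{f_{n-1} \phi + f_{n-2}} = \phi$$

for all n. Thus equations 3.20 and 3.21 are equivalent to

$$\phi c = \frac{1}{\delta} \phi \qquad (3.22)$$

and

$$\phi e + a = \frac{1}{\delta} \phi^2 \qquad (3.23)$$

respectively. For $A = e^{-i3\pi/5}$ these both yield $\delta = A - 1$. Hence

$$\widetilde{\mathrm{Tr}}(b\,\sigma_{n-1}) = \frac{1}{\delta} \, \frac{1}{f_{n-1} \phi + f_{n-2}} \Bigg[ \sum_{s \in Q'_{n-1}} \langle s{*} | b | s{*} \rangle + \phi \sum_{s \in Q_{n-1}} \langle s\mathrm{p} | b | s\mathrm{p} \rangle \Bigg] = \frac{1}{\delta} \widetilde{\mathrm{Tr}}(b),$$

thus confirming equation 3.19.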

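Now we calculate D from $\delta$. Solving 3.12 for $E_i$ yields

$$E_i = A^{-1} \rho_A(\sigma_i) - A^{-2} \mathbb{1} \qquad (3.24)$$

Substituting this into 3.18 yields

$$\mathrm{Tr}\left(X \left(A^{-1} \rho_A(\sigma_i) - A^{-2} \mathbb{1}\right)\right) = \frac{1}{D} \mathrm{Tr}(X)$$
$$\Rightarrow \quad A^{-1} \mathrm{Tr}(X \rho_A(\sigma_i)) - A^{-2} \mathrm{Tr}(X) = \frac{1}{D} \mathrm{Tr}(X).$$

Comparison to our relation $\mathrm{Tr}(X \rho_A(\sigma_i)) = \frac{1}{\delta} \mathrm{Tr}(X)$ yields

$$\frac{A^{-1}}{\delta} - A^{-2} = \frac{1}{D}.$$

Solving for D and substituting in $A = e^{-i3\pi/5}$ and $\delta = A - 1$ yields

$$D = \phi.$$

This is also equal to $-A^2 - A^{-2}$, consistent with the usage elsewhere.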
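The arithmetic in this step is easy to confirm numerically. The sketch below is only a check, not part of the proof: it uses the values $A = e^{-i3\pi/5}$ and $\delta = A - 1$ stated in this section, solves $A^{-1}/\delta - A^{-2} = 1/D$ for D, and compares the result with $\phi$ and with $-A^2 - A^{-2}$. It also checks the Fibonacci ratio property used above. Variable and function names are illustrative.

```python
import cmath
import math

PHI = (1 + math.sqrt(5)) / 2
A = cmath.exp(-3j * math.pi / 5)   # value of A used in this section
delta = A - 1                      # constant delta found from equations 3.22 and 3.23

# Solve A^{-1}/delta - A^{-2} = 1/D for D.
D_from_delta = 1 / (A**-1 / delta - A**-2)

# Definition of D in terms of A.
D_from_A = -A**2 - A**-2

# t = A^{-4} is the point at which the Jones polynomial is evaluated.
t = A**-4


def fib(k):
    """k-th Fibonacci number with f_0 = 0, f_1 = f_2 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def fib_ratio(n):
    """(f_n * phi + f_{n-1}) / (f_{n-1} * phi + f_{n-2})."""
    return (fib(n) * PHI + fib(n - 1)) / (fib(n - 1) * PHI + fib(n - 2))
```

Both routes to D agree with the golden ratio to machine precision, and $A^{-4}$ comes out as $e^{i2\pi/5}$.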

Thus we have shown that $\widetilde{\mathrm{Tr}}$ has all the necessary properties. We will next show that
the image of the representation $\rho_F$ of the braid group $B_n$ also forms a representation of the
Temperley-Lieb algebra $\mathrm{TL}_n(D)$. Specifically, $E_i$ is represented by

$$E_i \to A^{-1} \rho_F(\sigma_i) - A^{-2} \mathbb{1}. \qquad (3.25)$$

To show that this is a representation of $\mathrm{TL}_n(D)$ we must show that the matrices described
in equation 3.25 satisfy the properties 3.13, 3.14, and 3.15. By the theorem of [6] which
shows that a Markov trace on any representation of the Temperley-Lieb algebra yields the
Jones polynomial, it will follow that the trace of the Fibonacci representation yields the
Jones polynomial.

Since $\rho_F$ is a representation of the braid group and $\sigma_i \sigma_j = \sigma_j \sigma_i$ for |i − j| > 1, it
immediately follows that the matrices described in equation 3.25 satisfy condition 3.13.
Next, we'll consider condition 3.15. By inspection of the Fibonacci representation as given
by equation 3.4 we see that by appropriately ordering the basis (we will have to choose
different orderings for different $\sigma_i$'s) we can bring $\rho_F(\sigma_i)$ into
block diagonal form, where each block is one of the following 1 × 1 or 2 × 2 possibilities.

$$[a] \qquad [b] \qquad \begin{bmatrix} c & d \\ d & e \end{bmatrix}$$

Thus, by equation 3.25, it suffices to show that

$$\left( A^{-1} \begin{bmatrix} c & d \\ d & e \end{bmatrix} - A^{-2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)^2 = D \left( A^{-1} \begin{bmatrix} c & d \\ d & e \end{bmatrix} - A^{-2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right),$$

$$(A^{-1} a - A^{-2})^2 = D (A^{-1} a - A^{-2}),$$

and

$$(A^{-1} b - A^{-2})^2 = D (A^{-1} b - A^{-2}),$$

i.e. each of the blocks squares to D times itself. These properties are confirmed by
direct calculation.



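Now all that remains is to check that the correspondence 3.25 satisfies property 3.14.
Using the rules described in equation 3.4 we have, in the basis
*p*p, *ppp, *pp*, pppp, pp*p, p*pp, ppp*, p*p*,

$$\rho_F(\sigma_1) = \begin{bmatrix} b & & & & & & & \\ & a & & & & & & \\ & & a & & & & & \\ & & & e & & d & & \\ & & & & a & & & \\ & & & d & & c & & \\ & & & & & & e & d \\ & & & & & & d & c \end{bmatrix}, \qquad \rho_F(\sigma_2) = \begin{bmatrix} c & d & & & & & & \\ d & e & & & & & & \\ & & a & & & & & \\ & & & e & d & & & \\ & & & d & c & & & \\ & & & & & a & & \\ & & & & & & a & \\ & & & & & & & b \end{bmatrix}.$$

(Here we have considered all four subspaces unlike in equation 3.6.) Substituting this into
equation 3.25 yields matrices which satisfy condition 3.14. It follows that equation 3.25
yields a representation of the Temperley-Lieb algebra. This completes the proof that

$$V_{b^{\mathrm{tr}}}(A^{-4}) = (-A)^{3 w(b^{\mathrm{tr}})} \phi^{n-1} \, \widetilde{\mathrm{Tr}}(\rho_F(b^{\mathrm{tr}}))$$

for $A = e^{-i3\pi/5}$.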
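The "direct calculation" behind conditions 3.14 and 3.15 can be reproduced numerically. The sketch below builds the crossing matrices on all eight strings of length four from the local rules and checks the Temperley-Lieb relations for $E_i = A^{-1}\rho_F(\sigma_i) - A^{-2}\mathbb{1}$. Two things are assumptions rather than content of this section: the explicit values of a, b, c, d, e (equation 3.5 is defined earlier in the chapter, so the formulas below, with $\tau = 1/\phi$, are my reading of it), and the helper names `strings` and `rho`.

```python
import numpy as np
from itertools import product

PHI = (1 + 5 ** 0.5) / 2
TAU = 1 / PHI
A = np.exp(-3j * np.pi / 5)

# ASSUMED values of the constants of equation 3.5 (not restated in this section).
a = -A**4
b = A**8
c = A**8 * TAU**2 - A**4 * TAU
d = TAU**1.5 * (A**8 + A**4)
e = A**8 * TAU - A**4 * TAU**2


def strings(length):
    """All strings over {'*', 'p'} of the given length with no two adjacent '*'."""
    return [s for s in map("".join, product("*p", repeat=length)) if "**" not in s]


def rho(i, basis):
    """Matrix of sigma_i: it acts on symbol i+1 (1-indexed) given its two neighbours."""
    index = {s: k for k, s in enumerate(basis)}
    M = np.zeros((len(basis), len(basis)), dtype=complex)
    for s, k in index.items():
        left, mid, right = s[i - 1], s[i], s[i + 1]
        if left == "p" and right == "p":
            # The middle symbol can flip between '*' and 'p'.
            flipped = s[:i] + ("p" if mid == "*" else "*") + s[i + 1:]
            M[k, k] = c if mid == "*" else e
            M[index[flipped], k] = d
        elif left == "*" and right == "*":
            M[k, k] = b
        else:
            M[k, k] = a
    return M


basis = strings(4)
identity = np.eye(len(basis))
E = {i: A**-1 * rho(i, basis) - A**-2 * identity for i in (1, 2)}
```

With these inputs, `E[1] @ E[2] @ E[1]` reproduces `E[1]` and each `E[i]` squares to $\phi$ times itself, matching conditions 3.14 and 3.15 with $D = \phi$.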



3.9 Density of the Fibonacci representation 

In this section we will show that $\rho_{**}^{(n)}(B_n)$ is a dense subgroup of $SU(f_{n-1})$ modulo phase,
and that $\rho_{*p}^{(n)}(B_n)$ and $\rho_{p*}^{(n)}(B_n)$ are dense subgroups of $SU(f_n)$ modulo phase. Similar
results regarding the path model representation of the braid group were proven in [3]. Our
proofs will use many of the techniques introduced there.

We'll first show that $\rho_{**}^{(4)}(B_4)$ modulo phase is a dense subgroup of SU(2). We can then
use the bridge lemma from [4] to extend the result to arbitrary n.

Proposition 2. $\rho_{**}^{(4)}(B_4)$ modulo phase is a dense subgroup of SU(2).
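Proof. Using equation 3.4 we have, in the basis *p*p*, *ppp*:

$$\rho_{**}^{(4)}(\sigma_1) = \rho_{**}^{(4)}(\sigma_3) = \begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}, \qquad \rho_{**}^{(4)}(\sigma_2) = \begin{bmatrix} c & d \\ d & e \end{bmatrix}.$$

We do not care about global phase so we will take

$$\rho_{**}^{(n)}(\sigma_i) \to \left( \frac{1}{\det \rho_{**}^{(n)}(\sigma_i)} \right)^{1/f_{n-1}} \rho_{**}^{(n)}(\sigma_i)$$

to project into $SU(f_{n-1})$. Thus we must show the group ⟨A, B⟩ generated by

$$A = \frac{1}{\sqrt{ab}} \begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}, \qquad B = \frac{1}{\sqrt{ce - d^2}} \begin{bmatrix} c & d \\ d & e \end{bmatrix} \qquad (3.26)$$

is a dense subgroup of SU(2). To do this we will use the well known surjective homomorphism
$\phi : SU(2) \to SO(3)$ whose kernel is {±1} (cf. [15], pg. 276). A general element of
SU(2) can be written as

$$\cos\left(\frac{\theta}{2}\right) \mathbb{1} + i \sin\left(\frac{\theta}{2}\right) \left[ x \sigma_x + y \sigma_y + z \sigma_z \right]$$

where $\sigma_x, \sigma_y, \sigma_z$ are the Pauli matrices, and x, y, z are real numbers satisfying $x^2 + y^2 + z^2 = 1$.
$\phi$ maps this element to the rotation by angle $\theta$ about the axis

$$\vec{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}.$$

Using equations 3.26 and 3.5 one finds that $\phi(A)$ and $\phi(B)$ are both rotations by $7\pi/5$.
These rotations are about different axes which are separated by angle

$$\theta_{12} = \cos^{-1}(2 - \sqrt{5}) \simeq 1.8091137886\ldots$$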

To show that $\rho_{**}^{(4)}(B_4)$ modulo phase is a dense subgroup of SU(2) it suffices to show that
$\phi(A)$ and $\phi(B)$ generate a dense subgroup of SO(3). To do this we take advantage of the
fact that the finite subgroups of SO(3) are completely known.

Theorem 3. ([15] pg. 184) Every finite subgroup of SO(3) is one of the following:

$C_k$: the cyclic group of order k

$D_k$: the dihedral group of order k

T: the tetrahedral group (order 12)

O: the octahedral group (order 24)

I: the icosahedral group (order 60)

The infinite proper subgroups of SO(3) are all isomorphic to O(2) or SO(2). Thus,
since $\phi(A)$ and $\phi(B)$ are rotations about different axes, ⟨$\phi(A)$, $\phi(B)$⟩ can only be SO(3) or
a finite subgroup of SO(3). If we can show that ⟨$\phi(A)$, $\phi(B)$⟩ is not contained in any of the
finite subgroups of SO(3) then we are done.

Since $\phi(A)$ and $\phi(B)$ are rotations about different axes we know that ⟨$\phi(A)$, $\phi(B)$⟩ is not
$C_k$ or $D_k$. Next, we note that $R = \phi(A)^5 \phi(B)^5$ is a rotation by $2\theta_{12}$. By direct calculation,
$2\theta_{12}$ is not an integer multiple of $2\pi/k$ for k = 1, 2, 3, 4, or 5. Thus R has order greater
than 5. As mentioned on pg. 262 of [105], T, O, and I do not have any elements of order
greater than 5. Thus, ⟨$\phi(A)$, $\phi(B)$⟩ is not contained in T, O, or I, which completes the
proof. Alternatively, using more arithmetic and less group theory, we can see that $2\theta_{12}$ is
not any integer multiple of $2\pi/k$ for any k < 30, thus R cannot be in T, O, or I since its
order does not divide the order of any of these groups. □
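The numbers quoted in this proof can be reproduced with a short numerical sketch. It normalizes the two generators of equation 3.26 into SU(2), reads off the rotation angle and axis from the form $\cos(\theta/2)\mathbb{1} + i\sin(\theta/2)\,\vec{x}\cdot\vec{\sigma}$, and evaluates the separation of the axes. The values of a, b, c, d, e are an assumption (my reading of equation 3.5, which lies outside this section), and `axis_angle` is an illustrative helper. Since U and −U map to the same rotation, the helper picks the sign for which the angle is $7\pi/5$ rather than the equivalent $3\pi/5$ about the opposite axis.

```python
import numpy as np

PHI = (1 + 5 ** 0.5) / 2
TAU = 1 / PHI
A_param = np.exp(-3j * np.pi / 5)

# ASSUMED values of the constants of equation 3.5.
a = -A_param**4
b = A_param**8
c = A_param**8 * TAU**2 - A_param**4 * TAU
d = TAU**1.5 * (A_param**8 + A_param**4)
e = A_param**8 * TAU - A_param**4 * TAU**2


def axis_angle(M):
    """Rotation angle and axis of the SO(3) image of a 2x2 unitary M (up to phase)."""
    U = M / np.sqrt(np.linalg.det(M))       # now det(U) = 1
    if U.trace().real > 0:                  # U and -U give the same rotation;
        U = -U                              # choose the branch with cos(theta/2) < 0
    theta = 2 * np.arccos(np.clip(U.trace().real / 2, -1.0, 1.0))
    s = np.sin(theta / 2)
    # U = cos(theta/2) 1 + i sin(theta/2) (x sigma_x + y sigma_y + z sigma_z)
    axis = np.array([U[0, 1].imag, U[0, 1].real, U[0, 0].imag]) / s
    return theta, axis


theta_A, axis_A = axis_angle(np.array([[b, 0], [0, a]]))
theta_B, axis_B = axis_angle(np.array([[c, d], [d, e]]))
theta_12 = np.arccos(np.clip(axis_A @ axis_B, -1.0, 1.0))

# Distance of 2*theta_12 / (2*pi/k) from the nearest integer, for each k < 30.
gaps = [abs(x - round(x)) for x in (2 * theta_12 * k / (2 * np.pi) for k in range(1, 30))]
```

Under these assumptions both angles come out as $7\pi/5$, the cosine of the separation is $2 - \sqrt{5}$, and none of the ratios for k < 30 is close to an integer.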

Next we'll consider $\rho_{**}^{(n)}$ for larger n. These will be matrices acting on the strings of length
n + 1. These can be divided into those which end in pp* and those which end in *p*. The
space upon which $\rho_{**}^{(n)}$ acts can correspondingly be divided into two subspaces which are the
span of these two sets of strings. From equation 3.4 we can see that $\rho_{**}^{(n)}(\sigma_1), \ldots, \rho_{**}^{(n)}(\sigma_{n-3})$
will leave these subspaces invariant. Thus if we order our basis to respect this grouping
of strings, $\rho_{**}^{(n)}(\sigma_1), \ldots, \rho_{**}^{(n)}(\sigma_{n-3})$ will appear block-diagonal with a block corresponding to
each of these subspaces.

The possible prefixes of *p* are all strings of length n − 2 that start with * and end
with p. Now consider the strings acted upon by $\rho_{**}^{(n-2)}$. These have length n − 1 and must
end in *. The possible prefixes of this * are all strings of length n − 2 that begin with
* and end with p. Thus these are in one to one correspondence with the strings acted
upon by $\rho_{**}^{(n)}$ that end in *p*. Furthermore, since the rules 3.4 depend only on the three
symbols neighboring a given crossing, the block of $\rho_{**}^{(n)}(\sigma_1), \ldots, \rho_{**}^{(n)}(\sigma_{n-3})$ corresponding to
the *p* subspace is exactly the same as $\rho_{**}^{(n-2)}(\sigma_1), \ldots, \rho_{**}^{(n-2)}(\sigma_{n-3})$. By a similar argument,
the block of $\rho_{**}^{(n)}(\sigma_1), \ldots, \rho_{**}^{(n)}(\sigma_{n-3})$ corresponding to the pp* subspace is exactly the same as
$\rho_{**}^{(n-1)}(\sigma_1), \ldots, \rho_{**}^{(n-1)}(\sigma_{n-3})$.

For any n > 3, $\rho_{**}^{(n)}(\sigma_{n-2})$ will not leave these subspaces invariant. This is because the
crossing $\sigma_{n-2}$ spans the (n − 1)th symbol. Thus if the (n − 2)th and nth symbols are p,
then by equation 3.4 $\rho_{**}^{(n)}(\sigma_{n-2})$ can flip the value of the (n − 1)th symbol. The nth symbol is
guaranteed to be p, since the (n + 1)th symbol is the last one and is therefore * by definition.
For any n > 3, the space acted upon by $\rho_{**}^{(n)}(\sigma_{n-2})$ will include some strings in which the
(n − 2)th symbol is p.

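As an example, for five strands, in the basis *p*pp*, *pppp*, *pp*p*:

$$\rho_{**}^{(5)}(\sigma_1) = \begin{bmatrix} b & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & a \end{bmatrix}, \qquad \rho_{**}^{(5)}(\sigma_2) = \begin{bmatrix} c & d & 0 \\ d & e & 0 \\ 0 & 0 & a \end{bmatrix},$$

$$\rho_{**}^{(5)}(\sigma_3) = \begin{bmatrix} a & 0 & 0 \\ 0 & e & d \\ 0 & d & c \end{bmatrix}, \qquad \rho_{**}^{(5)}(\sigma_4) = \begin{bmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{bmatrix}.$$

We recognize the upper 2 × 2 blocks of $\rho_{**}^{(5)}(\sigma_1)$ and $\rho_{**}^{(5)}(\sigma_2)$ from equation 3.6. The lower
1 × 1 block matches $\rho_{**}^{(3)}(\sigma_1)$ and $\rho_{**}^{(3)}(\sigma_2)$, which are both easily calculated to be [a]. $\rho_{**}^{(5)}(\sigma_3)$
mixes these two subspaces.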
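The recursive block structure rests on a counting fact: the strings acted on by $\rho_{**}^{(n)}$ split into those ending in pp* and those ending in *p*, with dimensions $f_{n-2}$ and $f_{n-3}$ adding up to $f_{n-1}$. A small brute-force sketch (function names are illustrative) makes this concrete:

```python
from itertools import product


def fib(k):
    """k-th Fibonacci number with f_0 = 0, f_1 = f_2 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def star_to_star_strings(n):
    """Strings of length n+1 that start and end with '*' and never contain '**'."""
    out = []
    for middle in product("*p", repeat=n - 1):
        s = "*" + "".join(middle) + "*"
        if "**" not in s:
            out.append(s)
    return out


def split_by_suffix(n):
    """Partition the basis of rho_**^(n) by the last three symbols."""
    strs = star_to_star_strings(n)
    upper = [s for s in strs if s.endswith("pp*")]   # block matching rho_**^(n-1)
    lower = [s for s in strs if s.endswith("*p*")]   # block matching rho_**^(n-2)
    return upper, lower
```

For n = 5 this returns the two strings *p*pp*, *pppp* in the upper block and the single string *pp*p* in the lower block, as in the example above.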

We can now use the preceding observations about the recursive structure of
$\{\rho_{**}^{(n)} \,|\, n = 4, 5, 6, 7, \ldots\}$ to show inductively that $\rho_{**}^{(n)}(B_n)$ forms a dense subgroup of $SU(f_{n-1})$ for all
n. To perform the induction step we use the following bridge lemma and decoupling lemma.

Lemma 1 (Bridge Lemma). Let $C = A \oplus B$ where A and B are vector spaces with dim B >
dim A ≥ 1. Let $W \in SU(C)$ be a linear transformation which mixes the subspaces A and
B. Then the group generated by SU(A), SU(B), and W is dense in SU(C).

Lemma 2 (Decoupling Lemma). Let G be an infinite discrete group, and let A and B be
two vector spaces with dim(A) ≠ dim(B). Let $\rho_a : G \to SU(A)$ and $\rho_b : G \to SU(B)$ be
homomorphisms such that $\rho_a(G)$ is dense in SU(A) and $\rho_b(G)$ is dense in SU(B). Then
for any $U_a \in SU(A)$ there exists a series of G-elements $\alpha_n$ such that $\lim_{n \to \infty} \rho_a(\alpha_n) = U_a$
and $\lim_{n \to \infty} \rho_b(\alpha_n) = \mathbb{1}$. Similarly, for any $U_b \in SU(B)$, there exists a series $\beta_n \in G$ such
that $\lim_{n \to \infty} \rho_a(\beta_n) = \mathbb{1}$ and $\lim_{n \to \infty} \rho_b(\beta_n) = U_b$.

With these in hand we can prove the main proposition of this section.

Proposition 3. For any n ≥ 3, $\rho_{**}^{(n)}(B_n)$ modulo phase is a dense subgroup of $SU(f_{n-1})$.

Proof. As mentioned previously, the proof will be inductive. The base cases are n = 3
and n = 4. As mentioned previously, $\rho_{**}^{(3)}(\sigma_1) = \rho_{**}^{(3)}(\sigma_2) = [a]$. Trivially, these generate
a dense subgroup of (indeed, all of) SU(1) = {1} modulo phase. By proposition 2,
$\rho_{**}^{(4)}(\sigma_1)$ and $\rho_{**}^{(4)}(\sigma_2)$ generate a dense subgroup of SU(2) modulo phase. Now for induction
assume that $\rho_{**}^{(n-1)}(B_{n-1})$ is a dense subgroup of $SU(f_{n-2})$ and $\rho_{**}^{(n-2)}(B_{n-2})$ is a dense
subgroup of $SU(f_{n-3})$. As noted above, these correspond to the upper and lower blocks
of $\rho_{**}^{(n)}(\sigma_1), \ldots, \rho_{**}^{(n)}(\sigma_{n-3})$. Thus, by the decoupling lemma, $\rho_{**}^{(n)}(B_n)$ contains an element
arbitrarily close to $U \oplus \mathbb{1}$ for any $U \in SU(f_{n-2})$ and an element arbitrarily close to $\mathbb{1} \oplus U$
for any $U \in SU(f_{n-3})$. Since, as observed above, $\rho_{**}^{(n)}(\sigma_{n-2})$ mixes these two subspaces, the
bridge lemma shows that $\rho_{**}^{(n)}(B_n)$ is dense in $SU(f_{n-1})$. □

From this, the density of $\rho_{*p}^{(n)}$ and $\rho_{p*}^{(n)}$ easily follow.

Corollary 1. $\rho_{*p}^{(n)}(B_n)$ and $\rho_{p*}^{(n)}(B_n)$ are dense subgroups of $SU(f_n)$ modulo phase.

Proof. It is not hard to see that

$$\rho_{*p}^{(n)}(\sigma_1) = \rho_{**}^{(n+1)}(\sigma_1)$$
$$\vdots$$
$$\rho_{*p}^{(n)}(\sigma_{n-1}) = \rho_{**}^{(n+1)}(\sigma_{n-1}).$$

As we saw in the proof of proposition 3, $\rho_{**}^{(n+1)}(\sigma_n)$ is not necessary to obtain density in
$SU(f_n)$, that is, ⟨$\rho_{**}^{(n+1)}(\sigma_1), \ldots, \rho_{**}^{(n+1)}(\sigma_{n-1})$⟩ is a dense subgroup of $SU(f_n)$ modulo phase.
Thus, the density of $\rho_{*p}^{(n)}$ in $SU(f_n)$ follows immediately from the proof of proposition 3.
By symmetry, $\rho_{p*}^{(n)}(B_n)$ is isomorphic to $\rho_{*p}^{(n)}(B_n)$, thus this is a dense subgroup of $SU(f_n)$
modulo phase as well. □

3.10 Fibonacci and Path Model Representations 

For any braid group $B_n$, and any root of unity $e^{i2\pi/k}$, the path model representation is a
homomorphism from $B_n$ to a set of linear operators. The vector space that these linear
operators act on is the space of formal linear combinations of n step paths on the rungs of
a ladder of height k − 1 that start on the bottom rung. As an example, all the paths for
n = 4, k = 5 are shown below.



Thus, the n = 4, k = 5 path model representation is on a five dimensional vector space. For
k = 5 we can make a bijective correspondence between the allowed paths of n steps and
the set of strings of p and * symbols of length n + 1 which start with * and have no two *
symbols in a row. To do this, simply label the rungs from top to bottom as *, p, p, *, and
directly read off the symbol string for each path as shown below.



In [6], it is explained in detail for any given braid how to calculate the corresponding linear
transformation on paths. Using the correspondence described above, one finds that the
path model representation for k = 5 is equal to −1 times the Fibonacci representation
described in this paper. This sign difference is a minor detail which arises only because [6]
chooses a different fourth root of t for A than we do. This sign difference is automatically
compensated for in the factor of $(-A)^{3 \cdot \mathrm{writhe}}$, so that both methods yield the correct Jones
polynomial.
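The correspondence between ladder paths and symbol strings for k = 5 can be spelled out in a few lines. The sketch below enumerates the n-step paths on a ladder with k − 1 rungs starting from the bottom rung, reads off the string by labelling the rungs *, p, p, * from top to bottom, and compares the result with the set of allowed strings. The function names and the convention of numbering rungs from the bottom are illustrative assumptions.

```python
from itertools import product

RUNG_LABELS_TOP_TO_BOTTOM = ["*", "p", "p", "*"]   # k = 5, so k - 1 = 4 rungs


def ladder_paths(n, k):
    """All n-step paths on rungs 0..k-2 (0 = bottom) that start on the bottom rung."""
    rungs = k - 1
    paths = [[0]]
    for _ in range(n):
        paths = [p + [q] for p in paths
                 for q in (p[-1] - 1, p[-1] + 1) if 0 <= q < rungs]
    return paths


def path_to_string(path):
    """Read off the symbol string of a path from the rung labels."""
    labels_bottom_up = RUNG_LABELS_TOP_TO_BOTTOM[::-1]
    return "".join(labels_bottom_up[r] for r in path)


def allowed_strings(length):
    """Strings over {'*', 'p'} that start with '*' and never contain '**'."""
    out = []
    for tail in product("*p", repeat=length - 1):
        s = "*" + "".join(tail)
        if "**" not in s:
            out.append(s)
    return out
```

For n = 4 there are five paths, and their strings are exactly the five allowed strings of length five, consistent with the five dimensional vector space mentioned above.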



3.11 Unitaries on Logarithmically Many Strands 

In this section we'll prove the following proposition. 



Proposition 1. Given any pair of elements $U_{*p} \in SU(f_{k+1})$ and $U_{**} \in SU(f_k)$, and
any real parameter $\epsilon$, one can in polynomial time find a braid $b \in B_k$ with poly(n, log(1/$\epsilon$))
crossings whose Fibonacci representation satisfies $\|\rho_{*p}(b) - U_{*p}\| < \epsilon$ and $\|\rho_{**}(b) - U_{**}\| < \epsilon$,
provided that k = O(log n). By symmetry, the same holds when considering $\rho_{p*}$ rather than
$\rho_{*p}$.

To do so, we'll use a recursive construction. Suppose that we already know how to
achieve proposition 1 on n symbols, and we wish to extend this to n + 1 symbols. Using
the construction for n symbols we can efficiently obtain a unitary of the form



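$$M_{n-1}(A, B) = \begin{bmatrix} A & & \\ & A & \\ & & B \end{bmatrix} \qquad (3.27)$$

where A and B are arbitrary unitaries of the appropriate dimension, and the three diagonal
blocks act on the strings ending in *p* or pp*, in *pp or ppp, and in p*p, respectively. The elementary
crossing $\sigma_n$ on the last two strands has the representation

$$M_n = \begin{bmatrix} b & & & & \\ & a & & & \\ & & a & & \\ & & & e & d \\ & & & d & c \end{bmatrix}$$

in the basis *…*p*, *…pp*, *…*pp, *…ppp, *…p*p. As a special case of equation 3.27 we can obtain

$$M_{\mathrm{diag}}(\alpha) = M_{n-1}\left(e^{i\alpha/2}, e^{-i\alpha/2}\right),$$

which multiplies the strings ending in p*p by $e^{-i\alpha/2}$ and all other strings by $e^{i\alpha/2}$,
where 0 ≤ $\alpha$ < 2$\pi$. We'll now show the following.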
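Lemma 3. For any element

$$\begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix} \in SU(2),$$

one can find some product P of O(1) $M_{\mathrm{diag}}$ matrices and $M_n$ matrices such that for some
phases $\phi_1$ and $\phi_2$, P is diagonal with entries $\phi_1$ and $\phi_2$ on the strings ending in *p*, pp*, and *pp, and acts as

$$\begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}$$

on the strings ending in ppp and p*p.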

Proof. Let $B_{\mathrm{diag}}(\alpha)$ and $B_n$ be the following 2 × 2 matrices

$$B_{\mathrm{diag}}(\alpha) = \begin{bmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{bmatrix} \qquad \text{and} \qquad B_n = \begin{bmatrix} e & d \\ d & c \end{bmatrix},$$

which are the blocks of $M_{\mathrm{diag}}(\alpha)$ and $M_n$ acting on the strings ending in ppp and p*p.
We wish to show that we can approximate an arbitrary element of SU(2) as a product
of O(1) $B_{\mathrm{diag}}$ and $B_n$ matrices. To do this, we will use the well known homomorphism
$\phi : SU(2) \to SO(3)$ whose kernel is {±1} (see section 3.9). To obtain an arbitrary element
V of SU(2) modulo phase it suffices to show that we can use $\phi(B_n)$ and $\phi(B_{\mathrm{diag}}(\alpha))$ to
obtain an arbitrary SO(3) rotation. In section 3.9 we showed that

$$\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} e & d \\ d & c \end{bmatrix}$$

correspond to two rotations of $7\pi/5$ about axes which are separated by an angle of $\theta_{12} \simeq
1.8091137886\ldots$ By the definition of $\phi$, $\phi(B_{\mathrm{diag}}(\alpha))$ is a rotation by angle $\alpha$ about the
same axis that $\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}$ rotates about. $\phi(B_n^5)$ is a $\pi$ rotation. Hence,
$R(\alpha) \equiv \phi(B_n^5 B_{\mathrm{diag}}(\alpha) B_n^5)$ is a rotation by angle $\alpha$ about an axis which is separated by angle $2\theta_{12} - \pi$
from the axis that $\phi(B_{\mathrm{diag}}(\alpha))$ rotates about. $Q \equiv R(\pi) \phi(B_{\mathrm{diag}}(\alpha)) R(\pi)$ is a rotation by angle
$\alpha$ about some axis whose angle of separation from the axis that $\phi(B_{\mathrm{diag}}(\alpha))$ rotates about
is $2(2\theta_{12} - \pi) \simeq 0.9532$. Similarly, by geometric visualization, $\phi(B_{\mathrm{diag}}(\alpha')) \, Q \, \phi(B_{\mathrm{diag}}(-\alpha'))$ is
a rotation by $\alpha$ about an axis whose angle of separation from the axis that Q rotates about
is anywhere from 0 to 2 × 0.9532 depending on the value of $\alpha'$. Since 2 × 0.9532 > $\pi/2$, there
exists some choice of $\alpha'$ such that this angle of separation is $\pi/2$. Thus, using the rotations
we have constructed we can perform Euler rotations to obtain an arbitrary rotation. □

(Footnote: Here and throughout this section when we write a scalar $\alpha$ in a block of the matrix we really mean $\alpha I$
where I is the identity operator of appropriate dimension.)



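As a special case of lemma 3 we can obtain, up to global phase, a matrix $M_{\mathrm{swap}}$ which,
in the basis *…*p*, *…pp*, *…*pp, *…ppp, *…p*p, acts as

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

on the strings ending in ppp and p*p, and as phases on the remaining strings.
Similarly, we can produce $M_{\mathrm{swap}}^{-1}$. Using $M_{\mathrm{swap}}$, $M_{\mathrm{swap}}^{-1}$, and equation 3.27 we can produce
the matrix

$$M_C = \begin{bmatrix} \mathbb{1} & & \\ & C & \\ & & \mathbb{1} \end{bmatrix}$$

with the same three blocks as in equation 3.27,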



for any unitary C. We do it as follows. Since C is a normal operator, it can be unitarily
diagonalized. That is, there exists some unitary U such that $U C U^{-1} = D$ for some diagonal
unitary D. Next, note that in equation 3.27 the dimension of B is more than half that of
A. Let d = dim(A) − dim(B), and let $I_d$ be the identity operator of dimension d. We
can easily construct two diagonal unitaries $D_1$ and $D_2$ of dimension dim(B) such that
$(D_1 \oplus I_d)(I_d \oplus D_2) = D$. As special cases of equation 3.27 we can obtain matrices
$M_{D_1}$, $M_{D_2}$, and $M_P$, written in the basis *…*p*, *…pp*, *…*pp, *…ppp, *…p*p, which contain
the blocks $D_1$, $D_2$, and P respectively, where P is a permutation matrix that shifts the lowest
dim(B) basis states from the bottom of the block to the top of the block. By conjugating
$M_{D_1}$ and $M_{D_2}$ with $M_{\mathrm{swap}}$ and $M_P$ we obtain two matrices $M_1$ and $M_2$ whose product
$M_1 M_2$ contains the block D. As a special case of equation 3.27 we can also obtain a matrix
$M_U$ containing the block U. Thus we obtain $M_C$ by the construction $M_C = M_U M_1 M_2 M_U^{-1}$.

(Footnote on the angle $2\theta_{12} - \pi$ in the proof of lemma 3: We subtract $\pi$ because the angle between axes of rotation is only defined modulo $\pi$. Our convention is
that these angles are in [0, $\pi$).)

By multiplying together $M_C$ and $M_{n-1}$ we can control the three blocks independently. For arbitrary unitaries A, B, C
of appropriate dimension we can obtain



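$$M_{ACB} = \begin{bmatrix} A & & \\ & C & \\ & & B \end{bmatrix} \qquad (3.28)$$

where, as in equation 3.27, the three blocks act on the strings ending in *p* or pp*, in *pp or ppp,
and in p*p, respectively.

As a special case of equation 3.28 we can obtain a diagonal matrix $M_{\mathrm{unphase}}$ which cancels the
phases appearing in $M_{\mathrm{swap}}$. Thus, we obtain a clean swap

$$M_{\mathrm{clean}} = M_{\mathrm{unphase}} M_{\mathrm{swap}}, \qquad (3.29)$$

which exchanges the *…ppp and *…p*p subspaces and acts as the identity on the remaining basis states.

We'll now use $M_{\mathrm{clean}}$ and $M_{ACB}$ as our building blocks to create the maximally general
unitary

$$M_{\mathrm{gen}}(V, W) = \begin{bmatrix} V & \\ & W \end{bmatrix} \qquad (3.30)$$

where V acts on the strings ending in *p* or pp*, and W acts on the strings ending in *pp, ppp, or p*p.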



For n + 1 symbols, the *…*pp subspace has dimension $f_{n-3}$, and the *…p*p and
*…ppp subspaces each have dimension $f_{n-2}$. Thus, in equation 3.28 the block C has
dimension $f_{n-2} + f_{n-3} = f_{n-1}$, and the block B has dimension $f_{n-2}$. To construct $M_{\mathrm{gen}}(V, W)$
we will choose a subset of the basis states acted upon by the B and C blocks and permute
them into the C block. Then using $M_{ACB}$, we'll perform an arbitrary unitary on these
basis states. At each such step we can act upon a subspace whose dimension is a constant
fraction of the dimension of the entire $f_n$ dimensional space on which we wish to apply
an arbitrary unitary. Furthermore, this constant fraction is more than half. Specifically,
$f_{n-1}/f_n \simeq 1/\phi \simeq 0.62$ for large n. We'll show that an arbitrary unitary can be built up as
a product of a constant number of unitaries each of which act only on half the basis states.
Thus our ability to act on approximately 62% of the basis states at each step is more than
sufficient.

Before proving this, we'll show how to permute an arbitrary set of basis states into the
C block of $M_{ACB}$. Just use $M_{\mathrm{clean}}$ to swap the B block into the *…ppp subspace of the C
block. Then, as a special case of equation 3.28, choose A and B to be the identity, and C
to be a permutation which swaps some states between the *…*pp and *…ppp subspaces
of the C block. The states which we swap up from the *…ppp subspace are the ones
from B which we wish to move into C. The ones which we swap down from the *…*pp
subspace are the ones from C which we wish to move into B. This process allows us to
swap a maximum of $f_{n-3}$ states between the B block and the C block. Since $f_{n-3}$ is more
than half the dimension of the B block, it follows that any desired permutation of states
between the B and C blocks can be achieved using two repetitions of this process.

We'll now show the following. 

Lemma 4. Let m be divisible by 4. Any m × m unitary can be obtained as a product of
seven unitaries, each of which act only on the space spanned by m/2 of the basis states, and
leave the rest of the basis states undisturbed.

It will be obvious from the proof that even if the dimension of the matrix is not divisible
by four, and the fraction of the basis states on which the individual unitaries act is not
exactly 1/2, it will still be possible to obtain an arbitrary unitary using a constant number
of steps independent of m. Therefore, we will not explicitly work out this straightforward
generalization.



Proof. In [137] it is shown that for any unitary U, one can always find a series of unitaries
$L_n, \ldots, L_1$ which each act on only two basis states such that $L_n \cdots L_1 U$ is the identity. Thus
$L_n \cdots L_1 = U^{-1}$. It follows that any unitary can be obtained as a product of such two level
unitaries. The individual matrices $L_1, \ldots, L_n$ each perform a (unitary) row operation on
U. The sequence $L_n \cdots L_1$ reduces U to the identity by a method very similar to Gaussian
elimination. We will use a very similar construction to prove the present lemma. The
essential difference is that we must perform the two level unitaries in groups. That is, we
choose some set of m/2 basis states, perform a series of two level unitaries on them, then
choose another set of m/2 basis states, perform a series of two level unitaries on them, and
so on. After a finite number of such steps (it turns out that seven will suffice) we will reduce
U to the identity.

Our two-level unitaries will all be of the same type. We'll fix our attention on two
entries in U taken from a particular column: $U_{ik}$ and $U_{jk}$. We wish to perform a unitary
row operation, i.e. left multiply by a two level unitary, to set $U_{jk} = 0$. If $U_{ik}$ and $U_{jk}$ are
not both zero, then the two-level unitary which acts on the rows i and j according to

$$\frac{1}{\sqrt{|U_{ik}|^2 + |U_{jk}|^2}} \begin{bmatrix} U_{ik}^* & U_{jk}^* \\ U_{jk} & -U_{ik} \end{bmatrix} \qquad (3.31)$$

will achieve this. If $U_{ik}$ and $U_{jk}$ are both zero there is nothing to be done.

We can now use this two level operation within groups of basis states to eliminate matrix 
elements of U one by one. As in Gaussian elimination, the key is that once you've obtained 
some zero matrix elements, your subsequent row operations must be chosen so that they do 
not make these nonzero again, undoing your previous work. 

As the first step, we'll act on the top m/2 rows in order to reduce the upper-left quadrant
of U to upper triangular form. We can do this as follows. Consider the first and second
entries in the first column. Using the operation 3.31 we can make the second entry zero.
Next consider the first and third entries in the first column. By operation 3.31 we can
similarly make the third entry zero. Repeating this procedure, we get all of the entries
in the top half of the first column to be zero other than the top entry. Next, we perform
the same procedure on the second column except leaving out the top row. These row
operations will not alter the first column since the rows being acted upon all have zero in
the first column. We can then repeat this procedure for each column in the left half of U
until the upper-left block is upper triangular.
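The first step is concrete enough to sketch in code. The following minimal NumPy example builds the two-level unitary of equation 3.31 and applies it column by column to the top m/2 rows only, until the upper-left quadrant is upper triangular. It illustrates the elimination pattern and is not the full seven-step procedure; the function names and the numerical tolerance are illustrative choices.

```python
import numpy as np


def two_level(U, i, j, k):
    """Two-level unitary on rows i and j (equation 3.31) that sets entry (j, k) to zero."""
    m = U.shape[0]
    L = np.eye(m, dtype=complex)
    x, y = U[i, k], U[j, k]
    norm = np.sqrt(abs(x) ** 2 + abs(y) ** 2)
    if norm < 1e-14:          # both entries are zero: nothing to be done
        return L
    L[i, i], L[i, j] = np.conj(x) / norm, np.conj(y) / norm
    L[j, i], L[j, j] = y / norm, -x / norm
    return L


def triangularize_top_left(U):
    """Act only on the top m/2 rows until the upper-left quadrant is upper triangular."""
    m = U.shape[0]
    half = m // 2
    V = U.astype(complex).copy()
    for k in range(half):               # column being cleared
        for j in range(k + 1, half):    # rows below the pivot, within the top half
            V = two_level(V, k, j, k) @ V
    return V


def random_unitary(m, seed=0):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    Q, _ = np.linalg.qr(Z)
    return Q
```

Because the pivot row and the row being cleared both already have zeros in the earlier columns, previously created zeros are never disturbed, which is the point made in the text.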

We'll now think of U in terms of 16 blocks of size (m/4) × (m/4). In the second step
we'll eliminate the matrix elements in the third block of the first column. The second step
is shown schematically as




The curly braces indicate the rows to be acted upon, and the unshaded areas represent 
zero matrix elements. This step can be performed very similarly to the first step. The 
nonzero matrix elements in the bottom part of the first column can be eliminated one by 
one by interacting with the first row. The nonzero matrix elements in the bottom part of 
the second column can then be eliminated one by one by interacting with the second row. 
The first column will be undisturbed by this because the rows being acted upon in this step 
have zero matrix elements in the first column. Similarly acting on the remaining columns 
yields the desired result. 



The next step, as shown below, is nearly identical and can be done the same way.

[schematic of the third elimination step] (3.32)

The matrix on the right hand side of 3.32 is unitary. It follows that it must be of the form




where the upper-leftmost block is a diagonal unitary. We can next apply the same sorts of 
steps to the lower 3x3 blocks, as illustrated below. 




By unitarity the resulting matrix is actually of the form 



where the lower-right quadrant is an (m/2) × (m/2) unitary matrix, and the upper-left
quadrant is an (m/2) × (m/2) diagonal unitary matrix. We can now apply the inverse of
the upper-left quadrant to the top m/2 rows and then apply the inverse of the lower-right
quadrant to the bottom m/2 rows. This results in the identity matrix, and we are done. In
total we have used seven steps. □
Examining the preceding construction, we can see the recursive step uses a constant
number of the $M_{n-1}$ operators from the next lower level of recursion, plus a constant number
of $M_n$ operators. Thus, the number of crossings in the braid grows only exponentially in
the recursion depth. Since each recursion adds one more symbol, we see that to construct
$M_{\mathrm{gen}}(V, W)$ on logarithmically many symbols requires only polynomially many crossings in
the corresponding braid.
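The counting argument can be written as a short recurrence. The symbols below are introduced only for illustration and do not appear in the text: $N(k)$ is the number of crossings used on k symbols, $c > 1$ bounds the number of lower-level operators used per recursive step, $c'$ bounds the number of extra crossings per step, and $k_0$ is the number of symbols in the base case.

```latex
\begin{align*}
N(k) &\le c\,N(k-1) + c' \\
\Longrightarrow\quad N(k) + \frac{c'}{c-1} &\le c\left(N(k-1) + \frac{c'}{c-1}\right) \\
\Longrightarrow\quad N(k) &\le c^{\,k-k_0}\left(N(k_0) + \frac{c'}{c-1}\right).
\end{align*}
% With k = O(log n) the factor c^{k - k_0} is n^{O(1)}, so N(k) is polynomial in n.
```

So exponential growth in the recursion depth is harmless as long as the depth is logarithmic, which is exactly the regime k = O(log n) of proposition 1.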

The main remaining task is to work out the base case on which the recursion rests.
Since the base case is for a fixed set of generators on a fixed number of symbols, we can
simply use the Solovay-Kitaev theorem [116].

Theorem 4 (Solovay-Kitaev). Suppose matrices $U_1, \ldots, U_r$ generate a dense subgroup in
SU(d). Then, given a desired unitary $U \in SU(d)$, and a precision parameter $\delta > 0$, there
is an algorithm to find a product V of $U_1, \ldots, U_r$ and their inverses such that $\|V - U\| \le \delta$.
The length of the product and the runtime of the algorithm are both polynomial in $\log(1/\delta)$.

Because the total complexity of the process is polynomial, it is only necessary to implement
the base case to polynomially small $\delta$ in order for the final unitary $M_{\mathrm{gen}}(V, W)$
to have polynomial precision. This follows from simple error propagation. An analogous
statement about the precision of gates needed in quantum circuits is worked out in [137].
This completes the proof of proposition 1.

3.12 Zeckendorf Representation 

Following [111], to construct the Fibonacci representation of the braid group, we use strings
of p and * symbols such that no two * symbols are adjacent. There exists a bijection $z$
between such strings and the integers, known as the Zeckendorf representation. Let $P_n$ be
the set of all such strings of length $n$. To construct the map $z : P_n \to \{0, 1, \ldots, f_{n+2} - 1\}$ we
think of * as one and p as zero. Then, for a given string $s = s_n s_{n-1} \ldots s_1$ we associate the
integer

\[ z(s) = \sum_{i=1}^{n} s_i f_{i+1}, \tag{3.33} \]

where $f_i$ is the $i$th Fibonacci number: $f_1 = 1$, $f_2 = 1$, $f_3 = 2$, and so on. In this section we'll
show the following.

Proposition 4. For any $n$, the map $z : P_n \to \{0, \ldots, f_{n+2} - 1\}$ defined by $z(s) = \sum_{i=1}^{n} s_i f_{i+1}$
is bijective.

Proof. We'll inductively show that the following two statements are true for every $n \geq 2$.

$A_n$ : $z$ maps strings of length $n$ starting with p bijectively to $\{0, \ldots, f_{n+1} - 1\}$.

$B_n$ : $z$ maps strings of length $n$ starting with * bijectively to $\{f_{n+1}, \ldots, f_{n+2} - 1\}$.

Together, $A_n$ and $B_n$ imply that $z$ maps $P_n$ bijectively to $\{0, \ldots, f_{n+2} - 1\}$. As a base
case, we can look at $n = 2$.

pp ↔ 0
p* ↔ 1
*p ↔ 2

Thus $A_2$ and $B_2$ are true. Now for the induction. Let $s_{n-1} \in P_{n-1}$. By equation (3.33),

\[ z(\mathrm{p}\, s_{n-1}) = z(s_{n-1}). \]

Since $s_{n-1}$ follows a p symbol, it can be any element of $P_{n-1}$. By induction, $z$ maps
$P_{n-1}$ bijectively to $\{0, \ldots, f_{n+1} - 1\}$, thus $A_n$ is true. Similarly, by equation (3.33),

\[ z(*\, s_{n-1}) = f_{n+1} + z(s_{n-1}). \]

Since $s_{n-1}$ here follows a *, its allowed values are exactly those strings which start with
p. By induction, $A_{n-1}$ tells us that $z$ maps these bijectively to $\{0, \ldots, f_n - 1\}$. Since
$f_{n+1} + f_n = f_{n+2}$, this implies $B_n$ is true. Together, $A_n$ and $B_n$ for all $n \geq 2$, along with
the trivial $n = 1$ case, imply Proposition 4. □
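As a sanity check on Proposition 4, the following minimal Python sketch enumerates the allowed strings for small $n$ and confirms that the map of equation (3.33) hits every integer from 0 to $f_{n+2} - 1$ exactly once. The function names are illustrative choices, not notation from the thesis.

```python
from itertools import product

def fib(i):
    """Return the i-th Fibonacci number with f_1 = f_2 = 1."""
    a, b = 1, 1
    for _ in range(i - 1):
        a, b = b, a + b
    return a

def allowed_strings(n):
    """All length-n strings over {'p', '*'} with no two adjacent '*'."""
    strings = ("".join(t) for t in product("p*", repeat=n))
    return [s for s in strings if "**" not in s]

def z(s):
    """Equation (3.33): the leftmost symbol is s_n and the rightmost is s_1."""
    n = len(s)
    return sum(fib(i + 1) for i in range(1, n + 1) if s[n - i] == "*")

def is_bijection(n):
    """True if z maps P_n one-to-one onto {0, ..., f_{n+2} - 1}."""
    return sorted(z(s) for s in allowed_strings(n)) == list(range(fib(n + 2)))
```

For example, `z("p*")` and `z("*p")` reproduce the base case values 1 and 2 used in the proof.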
Chapter 4

Perturbative Gadgets



4.1 Introduction



Perturbative gadgets were introduced to construct a two-local Hamiltonian whose low energy
effective Hamiltonian corresponds to a desired three-local Hamiltonian. They were
originally developed by Kempe, Kitaev, and Regev in 2004 to prove the QMA-completeness
of the 2-local Hamiltonian problem and to simulate 3-local adiabatic quantum computation
using 2-local adiabatic quantum computation [113]. Perturbative gadgets have subsequently
been used to simulate spatially nonlocal Hamiltonians using spatially local Hamiltonians
and to find a minimal set of interactions for universal adiabatic quantum computation [27].
It was also pointed out in [140] that perturbative gadgets can be used recursively to obtain
$k$-local effective interactions using a 2-local Hamiltonian. Here we generalize perturbative
gadgets to directly obtain arbitrary $k$-local effective interactions by a single application of
$k$th order perturbation theory. Our formulation is based on a perturbation expansion due
to Bloch [28].

A $k$-local operator is one consisting of interactions between at most $k$ qubits. A general
$k$-local Hamiltonian on $n$ qubits can always be expressed as a sum of $r$ terms,

\[ H^{\mathrm{comp}} = \sum_{s=1}^{r} c_s H_s \tag{4.1} \]

with coefficients $c_s$, where each term $H_s$ is a $k$-fold tensor product of Pauli operators. That
is, $H_s$ couples some set of $k$ qubits according to

\[ H_s = \sigma_{s,1}\, \sigma_{s,2} \cdots \sigma_{s,k}, \tag{4.2} \]

where each operator $\sigma_{s,j}$ is of the form

\[ \sigma_{s,j} = \hat{n}_{s,j} \cdot \vec{\sigma}_{s,j}, \tag{4.3} \]

where $\hat{n}_{s,j}$ is a unit vector in $\mathbb{R}^3$, and $\vec{\sigma}_{s,j}$ is the vector of Pauli matrices operating on the
$j$th qubit in the set of $k$ qubits acted upon by $H_s$.

We wish to simulate $H^{\mathrm{comp}}$ using only 2-local interactions. To this end, for each term
$H_s$, we introduce $k$ ancilla qubits, generalizing the technique of [113]. There are then $rk$
ancilla qubits and $n$ computational qubits, and we choose the gadget Hamiltonian^1 as

\[ H^{\mathrm{gad}} = \sum_{s=1}^{r} H_s^{\mathrm{anc}} + \lambda \sum_{s=1}^{r} \sqrt[k]{c_s}\, V_s, \tag{4.4} \]

where

\[ H_s^{\mathrm{anc}} = \sum_{1 \leq i < j \leq k} \frac{1}{2} \left( I - Z_{s,i} Z_{s,j} \right), \tag{4.5} \]

and

\[ V_s = \sum_{j=1}^{k} \sigma_{s,j} \otimes X_{s,j}. \tag{4.6} \]

Figure 4-1: The ancilla qubits are all coupled together using ZZ couplings. This gives a
unit energy penalty for each pair of unaligned qubits. If there are $k$ bits, of which $j$ are in
the state $|1\rangle$ and the remaining $k - j$ are in the state $|0\rangle$, then the energy penalty is $j(k - j)$.
In the example shown in this diagram, the 1 and 0 labels indicate that the qubits are in the
state $|0001\rangle$, which has energy penalty 3.
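The energy penalty $j(k - j)$ quoted in the caption of figure 4-1 follows directly from the ancilla Hamiltonian of equation (4.5). The short Python sketch below evaluates that Hamiltonian on computational basis states and checks the formula; the function name is an illustrative assumption.

```python
from itertools import combinations, product

def ancilla_energy(bits):
    """Energy of a computational-basis ancilla state under equation (4.5).

    Each pair contributes (1 - z_i z_j)/2 with z = +1 for bit 0 and z = -1
    for bit 1, i.e. 1 for an unaligned pair and 0 for an aligned pair.
    """
    z = [1 - 2 * b for b in bits]
    pairs = combinations(range(len(bits)), 2)
    return sum((1 - z[i] * z[j]) // 2 for i, j in pairs)

def check_penalty_formula(k):
    """True if every k-bit string with j ones has energy j(k - j)."""
    return all(ancilla_energy(bits) == sum(bits) * (k - sum(bits))
               for bits in product((0, 1), repeat=k))
```

In particular `ancilla_energy((0, 0, 0, 1))` gives the penalty 3 of the example in the caption, and only the all-zeros and all-ones strings have zero energy.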

For each $s$ there is a corresponding register of $k$ ancilla qubits. The operators $X_{s,j}$ and
$Z_{s,j}$ are Pauli $X$ and $Z$ operators acting on the $j$th ancilla qubit in the ancilla register
associated with $s$. For each ancilla register, the ground space of $H_s^{\mathrm{anc}}$ is the span of $|000\ldots\rangle$
and $|111\ldots\rangle$. $\lambda$ is the small parameter in which the perturbative analysis is carried out.
For each $s$, the operator

\[ X_s^{\otimes k} = X_{s,1} \otimes X_{s,2} \otimes \cdots \otimes X_{s,k} \tag{4.7} \]

acting on the $k$ ancilla qubits in the register $s$ commutes with $H^{\mathrm{gad}}$. Since there are $r$ ancilla
registers, $H^{\mathrm{gad}}$ can be block diagonalized into $2^r$ blocks, where each register is in either the
$+1$ or $-1$ eigenspace of its $X_s^{\otimes k}$. In this paper, we analyze only the block corresponding
to the $+1$ eigenspace for every register. This $+1$ block of the gadget Hamiltonian is a
Hermitian operator, which we label $H_+^{\mathrm{gad}}$. We show that the effective Hamiltonian on the
low energy eigenstates of $H_+^{\mathrm{gad}}$ approximates $H^{\mathrm{comp}}$. For many purposes this is sufficient.
For example, suppose one wishes to simulate a $k$-local adiabatic quantum computer using
a 2-local adiabatic quantum computer. If the initial state of the computer lies within the
all $+1$ subspace, then the system will remain in this subspace throughout its evolution. To
put the initial state of the system into the all $+1$ subspace, one can initialize each ancilla
register to the state

\[ |+\rangle = \frac{1}{\sqrt{2}} \left( |000\ldots\rangle + |111\ldots\rangle \right), \tag{4.8} \]

which is the ground state of $H_s^{\mathrm{anc}}$ within the $+1$ subspace. Given the extensive experimental
literature on the preparation of states of the form $|+\rangle$, also known as cat states, a
supply of such states seems a reasonable resource to assume.

^1 For $H^{\mathrm{gad}}$ to be Hermitian, the coefficient of $V_s$ must be real. We therefore choose the sign of each $\hat{n}_{s,j}$
so that all $c_s$ are positive.



The purpose of the perturbative gadgets is to obtain $k$-local effective interactions in the
low energy subspace. To quantify this, we use the concept of an effective Hamiltonian. We
define this to be

\[ H_{\mathrm{eff}}(H, d) \equiv \sum_{j=1}^{d} E_j |\psi_j\rangle \langle \psi_j|, \tag{4.9} \]

where $|\psi_1\rangle, \ldots, |\psi_d\rangle$ are the $d$ lowest energy eigenstates of a Hamiltonian $H$, and $E_1, \ldots, E_d$
are their energies.

In section 4.3 we calculate $H_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^n)$ perturbatively to $k$th order in $\lambda$. To do this,
we write $H^{\mathrm{gad}}$ as

\[ H^{\mathrm{gad}} = H^{\mathrm{anc}} + \lambda V^{\mathrm{gad}}, \tag{4.10} \]

where

\[ H^{\mathrm{anc}} = \sum_{s=1}^{r} H_s^{\mathrm{anc}} \tag{4.11} \]

and

\[ V^{\mathrm{gad}} = \sum_{s=1}^{r} \sqrt[k]{c_s}\, V_s. \tag{4.12} \]

We consider $H^{\mathrm{anc}}$ to be the unperturbed Hamiltonian and $\lambda V^{\mathrm{gad}}$ to be the perturbation.
We find that $\lambda V^{\mathrm{gad}}$ perturbs the ground space of $H^{\mathrm{anc}}$ in two separate ways. The first is to
shift the energy of the entire space. The second is to split the degeneracy of the ground
space. This splitting arises at $k$th order in perturbation theory, because the lowest power
of $\lambda V^{\mathrm{gad}}$ that has nonzero matrix elements within the ground space of $H^{\mathrm{anc}}$ is the $k$th power.
It is this splitting which allows the low energy subspace of $H_+^{\mathrm{gad}}$ to mimic the spectrum of
$H^{\mathrm{comp}}$.

It is convenient to analyze the shift and the splitting separately. To do this, we define

\[ \tilde{H}_{\mathrm{eff}}(H, d, \Delta) \equiv H_{\mathrm{eff}}(H, d) - \Delta\, \Pi, \tag{4.13} \]

where $\Pi$ is the projector onto the support of $H_{\mathrm{eff}}(H, d)$. Thus, $\tilde{H}_{\mathrm{eff}}(H, d, \Delta)$ differs from
$H_{\mathrm{eff}}(H, d)$ only by an energy shift of magnitude $\Delta$. The eigenstates of $\tilde{H}_{\mathrm{eff}}(H, d, \Delta)$ are
identical to the eigenstates of $H_{\mathrm{eff}}(H, d)$, as are all the gaps between eigenenergies. The
rest of this paper is devoted to showing that, for any $k$-local Hamiltonian $H^{\mathrm{comp}}$ acting on
$n$ qubits, there exists some function $f(\lambda)$ such that

\[ \tilde{H}_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^n, f(\lambda)) = \frac{-k(-\lambda)^k}{(k-1)!} H^{\mathrm{comp}} \otimes P_+ + O(\lambda^{k+1}) \tag{4.14} \]

for sufficiently small $\lambda$. Here $P_+$ is an operator acting on the ancilla registers, projecting
each one into the state $|+\rangle$. To obtain equation (4.14) we use a formulation of degenerate
perturbation theory due to Bloch [28, 128], which we describe in the next section.
4.2 Perturbation Theory



Suppose we have a Hamiltonian of the form

\[ H = H^{(0)} + \lambda V, \tag{4.15} \]

where $H^{(0)}$ has a $d$-dimensional degenerate ground space $\mathcal{E}^{(0)}$ of energy zero. As discussed
in [110, 128], the effective Hamiltonian for the $d$ lowest eigenstates of $H$ can be obtained
directly as a perturbation series in $V$. However, for our purposes it is more convenient to
use an indirect method due to Bloch [28, 128], which we now describe. As shown in section
4.6, the perturbative expansions converge provided that

\[ \|\lambda V\| < \frac{\gamma}{4}, \tag{4.16} \]

where $\gamma$ is the energy gap between the eigenspace in question and the next nearest eigenspace,
and $\|\cdot\|$ denotes the operator norm^2.

Let $|\psi_1\rangle, \ldots, |\psi_d\rangle$ be the $d$ lowest energy eigenstates of $H$, and let $E_1, \ldots, E_d$ be their
energies. For small perturbations, these states lie primarily within $\mathcal{E}^{(0)}$. Let

\[ |\alpha_j\rangle = P_0 |\psi_j\rangle, \tag{4.17} \]

where $P_0$ is the projector onto $\mathcal{E}^{(0)}$. For $\lambda$ satisfying (4.16), the vectors $|\alpha_1\rangle, \ldots, |\alpha_d\rangle$ are
linearly independent, and there exists a linear operator $\mathcal{U}$ such that

\[ |\psi_j\rangle = \mathcal{U} |\alpha_j\rangle \quad \text{for } j = 1, 2, \ldots, d \tag{4.18} \]

and

\[ \mathcal{U} |\phi\rangle = 0 \quad \text{for } |\phi\rangle \in \mathcal{E}^{(0)\perp}. \tag{4.19} \]

Note that $\mathcal{U}$ is in general nonunitary. Let

\[ \mathcal{A} = \lambda P_0 V \mathcal{U}. \tag{4.20} \]

As shown in [128, 28] and recounted in section 4.5, the eigenvectors of $\mathcal{A}$ are $|\alpha_1\rangle, \ldots, |\alpha_d\rangle$,
and the corresponding eigenvalues are $E_1, \ldots, E_d$. Thus,

\[ H_{\mathrm{eff}} = \mathcal{U} \mathcal{A} \mathcal{U}^\dagger. \tag{4.21} \]

$\mathcal{A}$ and $\mathcal{U}$ have the following perturbative expansions. Let $S^l$ be the operator

\[ S^l = \begin{cases} \displaystyle\sum_{j \neq 0} \frac{P_j}{\left(-E_j^{(0)}\right)^l} & \text{if } l > 0 \\ -P_0 & \text{if } l = 0 \end{cases} \tag{4.22} \]

where $P_j$ is the projector onto the eigenspace of $H^{(0)}$ with energy $E_j^{(0)}$. (Recall that $E_0^{(0)} = 0$.) Then

\[ \mathcal{A} = \sum_{m=1}^{\infty} \mathcal{A}^{(m)}, \tag{4.23} \]

where

\[ \mathcal{A}^{(m)} = \lambda^m \sum_{(m-1)} P_0 V S^{l_1} V S^{l_2} \cdots V S^{l_{m-1}} V P_0, \tag{4.24} \]

and the sum is over all nonnegative integers $l_1, \ldots, l_{m-1}$ satisfying

\[ l_1 + \ldots + l_{m-1} = m - 1 \tag{4.25} \]
\[ l_1 + \ldots + l_p \geq p \quad (p = 1, 2, \ldots, m-2). \tag{4.26} \]

Similarly, $\mathcal{U}$ has the expansion

\[ \mathcal{U} = P_0 + \sum_{m=1}^{\infty} \mathcal{U}^{(m)}, \tag{4.27} \]

where

\[ \mathcal{U}^{(m)} = \lambda^m \sum_{(m)} S^{l_1} V S^{l_2} V \cdots V S^{l_m} V P_0, \tag{4.28} \]

and the sum is over all nonnegative integers $l_1, \ldots, l_m$ satisfying

\[ l_1 + \ldots + l_m = m \tag{4.29} \]
\[ l_1 + \ldots + l_p \geq p \quad (p = 1, 2, \ldots, m-1). \tag{4.30} \]

In section 4.5 we derive the expansions for $\mathcal{U}$ and $\mathcal{A}$, and in section 4.6 we prove that
condition (4.16) suffices to ensure convergence. The advantage of the method of [28] over the
direct approach of [110] is that $\mathcal{A}$ is an operator whose support is strictly within $\mathcal{E}^{(0)}$, which
makes some of the calculations more convenient.

^2 For any linear operator $M$, $\|M\| \equiv \max_{\langle\psi|\psi\rangle = 1} |\langle\psi| M |\psi\rangle|$.
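To make the index sets in the expansions concrete, the following Python sketch lists the tuples $(l_1, \ldots, l_m)$ allowed by conditions (4.29) and (4.30). Each tuple labels one term of $\mathcal{U}^{(m)}$ in equation (4.28); calling the same function with $m - 1$ gives the tuples of conditions (4.25) and (4.26) for $\mathcal{A}^{(m)}$. The function name is an illustrative assumption.

```python
def convex_tuples(m):
    """Yield all (l_1, ..., l_m) of nonnegative integers with
    l_1 + ... + l_m = m and l_1 + ... + l_p >= p for p = 1, ..., m-1."""
    def extend(prefix, total):
        p = len(prefix)
        if p == m:
            if total == m:
                yield tuple(prefix)
            return
        # The next partial sum must be at least p + 1 and at most m.
        for l in range(max(0, p + 1 - total), m - total + 1):
            yield from extend(prefix + [l], total + l)
    yield from extend([], 0)
```

For instance, `sorted(convex_tuples(2))` gives `[(1, 1), (2, 0)]`, so $\mathcal{U}^{(2)}$ contains the two terms $S^1 V S^1 V P_0$ and $S^2 V S^0 V P_0$. The number of terms grows quickly with the order, which is why the counting bound of section 4.6 is needed.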



4.3 Analysis of the Gadget Hamiltonian 

Before analyzing $H^{\mathrm{gad}}$ for a general $k$-local Hamiltonian, we first consider the case where
$H^{\mathrm{comp}}$ has one coefficient $c_s = 1$ and all the rest equal to zero. That is,

\[ H^{\mathrm{comp}} = \sigma_1 \sigma_2 \cdots \sigma_k, \tag{4.31} \]

where for each $j$, $\sigma_j = \hat{n}_j \cdot \vec{\sigma}_j$ for some unit vector $\hat{n}_j$ in $\mathbb{R}^3$. The corresponding gadget
Hamiltonian is thus

\[ H^{\mathrm{gad}} = H^{\mathrm{anc}} + \lambda V, \tag{4.32} \]

where

\[ H^{\mathrm{anc}} = \sum_{1 \leq i < j \leq k} \frac{1}{2} \left( I - Z_i Z_j \right), \tag{4.33} \]

and

\[ V = \sum_{j=1}^{k} \sigma_j \otimes X_j. \tag{4.34} \]

Here $\sigma_j$ acts on the $j$th computational qubit, and $X_j$ and $Z_j$ are the Pauli $X$ and $Z$
operators acting on the $j$th ancilla qubit. We use $k$th order perturbation theory to show
that $\tilde{H}_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^k, \Delta)$ approximates $H^{\mathrm{comp}}$ for appropriate $\Delta$.

We start by calculating $\mathcal{A}$ for $H_+^{\mathrm{gad}}$. For $H^{\mathrm{anc}}$, the energy gap is $\gamma = k - 1$, and $\|V\| = k$,
so by condition (4.16) we can use perturbation theory provided $\lambda$ satisfies

\[ \lambda < \frac{k-1}{4k}. \tag{4.35} \]

Because all terms in $\mathcal{A}$ are sandwiched by $P_0$ operators, the nonzero terms in $\mathcal{A}$ are ones in
which the $m$ powers of $V$ take a state in $\mathcal{E}^{(0)}$ and return it to $\mathcal{E}^{(0)}$. Because we are working
in the $+1$ eigenspace of $X^{\otimes k}$, an examination of equation (4.33) shows that $\mathcal{E}^{(0)}$ is the span
of the states in which the ancilla qubits are in the state $|+\rangle$. Thus, $P_0 = I \otimes P_+$, where $P_+$
acts only on the ancilla qubits, projecting them onto the state $|+\rangle$. Each term in $V$ flips one
ancilla qubit. To return to $\mathcal{E}^{(0)}$, the powers of $V$ must either flip some ancilla qubits and
then flip them back, or they must flip all of them. The latter process occurs at $k$th order
and gives rise to a term that mimics $H^{\mathrm{comp}}$. The former process occurs at many orders,
but at orders $k$ and lower gives rise only to terms proportional to $P_0$.

As an example, let's examine $\mathcal{A}$ up to second order for $k > 2$.

\[ \mathcal{A}^{(\leq 2)} = \lambda P_0 V P_0 + \lambda^2 P_0 V S^1 V P_0 \tag{4.36} \]

The term $P_0 V P_0$ is zero, because $V$ kicks the state out of $\mathcal{E}^{(0)}$. By equation (4.34) we see
that applying $V$ to a state in the ground space yields a state in the energy $k - 1$ eigenspace.
Substituting this denominator into $S^1$ yields

\[ \mathcal{A}^{(\leq 2)} = -\frac{\lambda^2}{k-1} P_0 V^2 P_0. \tag{4.37} \]

Because $V$ is a sum, $V^2$ consists of the squares of individual terms of $V$ and cross terms.
The cross terms flip two ancilla qubits, and thus do not return the state to the ground
space. The squares of individual terms are proportional to the identity, thus

\[ \mathcal{A}^{(\leq 2)} = \lambda^2 \alpha_2 P_0 \tag{4.38} \]

for some $\lambda$-independent constant $\alpha_2$. Similarly, at any order $m < k$, the only terms in $V^m$
which project back to $\mathcal{E}^{(0)}$ are those arising from squares of individual terms, which are
proportional to the identity. Thus, up to order $k - 1$,

\[ \mathcal{A}^{(\leq k-1)} = \left( \sum_m \alpha_m \lambda^m \right) P_0 \tag{4.39} \]

where the sum is over even $m$ between zero and $k - 1$ and $\alpha_0, \alpha_2, \ldots$ are the corresponding
coefficients.

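At $k$th order there arises another type of term. In $V^k$ there are $k$-fold cross terms in
which each of the terms in $V$ appears once. For example, there is the term

\[ \lambda^k P_0 (\sigma_1 \otimes X_1) S^1 (\sigma_2 \otimes X_2) S^1 \cdots S^1 (\sigma_k \otimes X_k) P_0. \tag{4.40} \]

The product of the energy denominators occurring in the $S$ operators is

\[ \prod_{j=1}^{k-1} \frac{1}{-j(k-j)} = \frac{(-1)^{k-1}}{((k-1)!)^2}. \tag{4.41} \]

Thus, this term is

\[ \frac{(-1)^{k-1} \lambda^k}{((k-1)!)^2} P_0 (\sigma_1 \otimes X_1)(\sigma_2 \otimes X_2) \cdots (\sigma_k \otimes X_k) P_0, \tag{4.42} \]

which can be rewritten as

\[ \frac{(-1)^{k-1} \lambda^k}{((k-1)!)^2} P_0 (\sigma_1 \sigma_2 \cdots \sigma_k \otimes X^{\otimes k}) P_0. \tag{4.43} \]

This term mimics $H^{\mathrm{comp}}$. The fact that all the $S$ operators in this term are $S^1$ is a general
feature. Any term in $\mathcal{A}^{(k)}$ where $l_1, \ldots, l_{k-1}$ are not all equal to 1 either vanishes or is
proportional to $P_0$. This is because such terms contain $P_0$ operators separated by fewer
than $k$ powers of $V$, and thus the same arguments used for $m < k$ apply.

There are a total of $k!$ terms of the type shown in expression (4.40). Thus, up to $k$th order

\[ \mathcal{A}^{(\leq k)} = f(\lambda) P_0 + \frac{-k(-\lambda)^k}{(k-1)!} P_0 (\sigma_1 \sigma_2 \cdots \sigma_k \otimes X^{\otimes k}) P_0, \tag{4.44} \]

which can be written as

\[ \mathcal{A}^{(\leq k)} = f(\lambda) P_0 + \frac{-k(-\lambda)^k}{(k-1)!} P_0 (H^{\mathrm{comp}} \otimes X^{\otimes k}) P_0, \tag{4.45} \]

where $f(\lambda)$ is some polynomial in $\lambda$. Note that, up to $k$th order, $\mathcal{A}$ happens to be Hermitian.
The effective Hamiltonian is $\mathcal{U} \mathcal{A} \mathcal{U}^\dagger$, thus by equation (4.45)

\[ H_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^k) = \mathcal{U} f(\lambda) P_0 \mathcal{U}^\dagger + \mathcal{U} \left[ \frac{-k(-\lambda)^k}{(k-1)!} P_0 (H^{\mathrm{comp}} \otimes X^{\otimes k}) P_0 + O(\lambda^{k+1}) \right] \mathcal{U}^\dagger \]
\[ = f(\lambda) \Pi + \mathcal{U} \left[ \frac{-k(-\lambda)^k}{(k-1)!} P_0 (H^{\mathrm{comp}} \otimes X^{\otimes k}) P_0 + O(\lambda^{k+1}) \right] \mathcal{U}^\dagger \tag{4.46} \]

since $\mathcal{U} P_0 \mathcal{U}^\dagger = \Pi$. Thus,

\[ \tilde{H}_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^k, f(\lambda)) = \mathcal{U} \left[ \frac{-k(-\lambda)^k}{(k-1)!} P_0 (H^{\mathrm{comp}} \otimes X^{\otimes k}) P_0 + O(\lambda^{k+1}) \right] \mathcal{U}^\dagger. \tag{4.47} \]

To order $\lambda^k$, we can approximate $\mathcal{U}$ as $P_0$ since the higher order corrections to $\mathcal{U}$ give rise
to terms of order $\lambda^{k+1}$ and higher in the expression for $\tilde{H}_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^k, f(\lambda))$. Thus,

\[ \tilde{H}_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^k, f(\lambda)) = \frac{-k(-\lambda)^k}{(k-1)!} P_0 (H^{\mathrm{comp}} \otimes X^{\otimes k}) P_0 + O(\lambda^{k+1}). \tag{4.48} \]

Using $P_0 = I \otimes P_+$ we rewrite this as

\[ \tilde{H}_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^k, f(\lambda)) = \frac{-k(-\lambda)^k}{(k-1)!} H^{\mathrm{comp}} \otimes P_+ + O(\lambda^{k+1}). \tag{4.49} \]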
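The prefactor in equations (4.44) through (4.49) is the number of orderings, $k!$, times the product of energy denominators in equation (4.41). The Python sketch below checks this arithmetic exactly with rational numbers; it assumes only that after $j$ distinct ancilla flips the intermediate energy is $j(k - j)$ and that each $S^1$ contributes one factor $1/(-E)$. The function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def denominator_product(k):
    """Product of the k-1 energy denominators in a term such as (4.40)."""
    result = Fraction(1)
    for j in range(1, k):
        # After j flips the ancilla register has energy j(k - j).
        result *= Fraction(1, -j * (k - j))
    return result

def kth_order_coefficient(k):
    """Coefficient of lambda^k: k! orderings times the common denominator product."""
    return factorial(k) * denominator_product(k)
```

For $k = 3$ the coefficient is $3/2$, so a third order gadget produces the effective interaction $\tfrac{3}{2} \lambda^3 H^{\mathrm{comp}} \otimes P_+$ at leading order.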

Now let's return to the general case where $H^{\mathrm{comp}}$ is a linear combination of $k$-local terms
with arbitrary coefficients $c_s$, as described in equation (4.1). Now that we have gadgets to
obtain $k$-local effective interactions, it is tempting to eliminate one $k$-local interaction at a
time, by introducing corresponding gadgets one by one. However, this approach does not
lend itself to simple analysis by degenerate perturbation theory. This is because the different
$k$-local terms in general act on overlapping sets of qubits. Hence, we instead consider

\[ V^{\mathrm{gad}} = \sum_{s=1}^{r} \sqrt[k]{c_s}\, V_s \tag{4.50} \]

as a single perturbation, and work out the effective Hamiltonian in powers of this operator.
The unperturbed part of the total gadget Hamiltonian is thus

\[ H^{\mathrm{anc}} = \sum_{s=1}^{r} H_s^{\mathrm{anc}}, \tag{4.51} \]

which has energy gap $\gamma = k - 1$. The full Hamiltonian is

\[ H^{\mathrm{gad}} = H^{\mathrm{anc}} + \lambda V^{\mathrm{gad}}, \tag{4.52} \]

so the perturbation series is guaranteed to converge under the condition

\[ \lambda < \frac{k-1}{4 \|V^{\mathrm{gad}}\|}. \tag{4.53} \]

As mentioned previously, we will work only within the simultaneous $+1$ eigenspace of the
$X^{\otimes k}$ operators acting on each of the ancilla registers. In this subspace, $H^{\mathrm{anc}}$ has degeneracy
$2^n$ which gets split by the perturbation $\lambda V^{\mathrm{gad}}$ so that it mimics the spectrum of $H^{\mathrm{comp}}$.

Each $V_s$ term couples to a different ancilla register. Hence, any cross term between
different $V_s$ terms flips some ancilla qubits in one register and some ancilla qubits in another.
Thus, at $k$th order, non-identity cross terms between different $s$ cannot flip all $k$ ancilla
qubits in any given ancilla register, and they are thus projected away by the $P_0$ operators
appearing in the formula for $\mathcal{A}$. Hence the perturbative analysis proceeds just as it did
when there was only a single nonzero $c_s$, and one finds,

\[ \tilde{H}_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^n, f(\lambda)) = \frac{-k(-\lambda)^k}{(k-1)!} P_0 \left( \sum_{s=1}^{r} c_s H_s \otimes X_s^{\otimes k} \right) P_0 + O(\lambda^{k+1}), \tag{4.54} \]

where $X_s^{\otimes k}$ is the operator $X^{\otimes k}$ acting on the register of $k$ ancilla qubits corresponding to
a given $s$, and $f(\lambda)$ is some polynomial in $\lambda$ of degree at most $k$. Note that coefficients in
the polynomial $f(\lambda)$ depend on $H^{\mathrm{comp}}$. As before, this can be rewritten as

\[ \tilde{H}_{\mathrm{eff}}(H_+^{\mathrm{gad}}, 2^n, f(\lambda)) = \frac{-k(-\lambda)^k}{(k-1)!} H^{\mathrm{comp}} \otimes P_+ + O(\lambda^{k+1}), \tag{4.55} \]

where $P_+$ acts only on the ancilla registers, projecting them all into the $|+\rangle$ state. Hence, as
asserted in section 4.1, the 2-local gadget Hamiltonian $H^{\mathrm{gad}}$ generates effective interactions
which mimic the $k$-local Hamiltonian $H^{\mathrm{comp}}$. We expect that this technique may find many
applications in quantum computation, such as in proving QMA-completeness of Hamiltonian
problems, and constructing physically realistic Hamiltonians for adiabatic quantum
computation.

Figure 4-2: Here the ratio of the error terms to the ideal Hamiltonian $H^{\mathrm{id}} = \frac{-k(-\lambda)^k}{(k-1)!} H^{\mathrm{comp}} \otimes P_+$
is plotted. We examine three examples, a third order gadget simulating a single XYZ
interaction, a third order gadget simulating a pair of interactions XYZ + XYY, and a
fourth order gadget simulating a fourth order interaction XYZZ. Here $\tilde{H}_{\mathrm{eff}}$ is calculated
by direct numerical computation without using perturbation theory. As expected the ratio
of the norm of the error terms to $\|H^{\mathrm{id}}\|$ goes linearly to zero with shrinking $\lambda$.



4.4 Numerical Examples 

In this section we numerically examine the performance of perturbative gadgets in some
small examples. As shown in section 4.3, the shifted effective Hamiltonian is that given in
equation (4.55). We define

\[ H^{\mathrm{id}} \equiv \frac{-k(-\lambda)^k}{(k-1)!} H^{\mathrm{comp}} \otimes P_+. \tag{4.56} \]

$\tilde{H}_{\mathrm{eff}}$ consists of the ideal piece $H^{\mathrm{id}}$, which is of order $\lambda^k$, plus an error term of order $\lambda^{k+1}$ and
higher. For sufficiently small $\lambda$ these error terms are therefore small compared to the $H^{\mathrm{id}}$
term which simulates $H^{\mathrm{comp}}$. Indeed, by a calculation very similar to that which appears in
section 4.6, one can easily place an upper bound on the norm of the error terms. However,
in practice the actual size of the error terms may be smaller than this bound. To examine
the error magnitude in practice, we plot $\|\tilde{H}_{\mathrm{eff}} - H^{\mathrm{id}}\| / \|H^{\mathrm{id}}\|$ in figure 4-2 using direct numerical
computation of $\tilde{H}_{\mathrm{eff}}$ without perturbation theory. $f(\lambda)$ was calculated analytically for these
examples. In all cases the ratio of $\|\tilde{H}_{\mathrm{eff}} - H^{\mathrm{id}}\|$ to $\|H^{\mathrm{id}}\|$ scales approximately linearly with
$\lambda$, as one expects since the error terms are of order $\lambda^{k+1}$ and higher, whereas $H^{\mathrm{id}}$ is of order
$\lambda^k$.
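A small numerical experiment in the same spirit can be sketched with NumPy. This is not the code used for figure 4-2; it is a minimal illustration, under the assumption of a single third order gadget for the interaction XYZ, that diagonalizes the $+1$ block of the gadget Hamiltonian and compares only the splitting of the eight lowest energies with the leading order prediction $\tfrac{3}{2}\lambda^3$. Comparing spectra avoids having to compute the shift $f(\lambda)$. All function and variable names are illustrative.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def embed(ops, n):
    """Tensor product on n qubits with ops[q] on qubit q and identity elsewhere."""
    return reduce(np.kron, [ops.get(q, I2) for q in range(n)])

def gadget_low_energies(sigmas, lam):
    """Lowest 2^k energies of H^anc + lam * V in the +1 sector of X^{(x)k}.

    Qubits 0..k-1 are computational, qubits k..2k-1 are the ancillas.
    """
    k = len(sigmas)
    n = 2 * k
    dim = 2 ** n
    # Ancilla Hamiltonian of equation (4.33).
    h_anc = sum(0.5 * (np.eye(dim) - embed({k + i: Z, k + j: Z}, n))
                for i in range(k) for j in range(i + 1, k))
    # Perturbation of equation (4.34).
    v = sum(embed({j: sigmas[j], k + j: X}, n) for j in range(k))
    h_gad = h_anc + lam * v
    # Orthonormal basis of the +1 eigenspace of X on all ancillas.
    parity = embed({k + j: X for j in range(k)}, n)
    evals, evecs = np.linalg.eigh(parity)
    basis = evecs[:, evals > 0]
    h_plus = basis.conj().T @ h_gad @ basis
    return np.linalg.eigvalsh(h_plus)[: 2 ** k]

def half_splitting(sigmas, lam):
    """Half the spread of the low energy manifold, to compare with k*lam^k/(k-1)! for odd k."""
    energies = gadget_low_energies(sigmas, lam)
    return (energies[-1] - energies[0]) / 2
```

For example, `half_splitting([X, Y, Z], 0.02)` can be compared with `1.5 * 0.02**3`. Since the eigenvalues of XYZ are $\pm 1$, the eight low energies fall into two groups separated by twice this amount.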
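4.5 Derivation of Perturbative Formulas



In this section we give a self-contained presentation of the derivations for the method of
degenerate perturbation theory used in this paper. We closely follow Bloch [28]. Given a
Hamiltonian of the form

\[ H = H^{(0)} + \lambda V \tag{4.57} \]

we wish to find the effective Hamiltonian induced by the perturbation $\lambda V$ on the ground
space of $H^{(0)}$. In what follows, we assume that the ground space of $H^{(0)}$ has energy zero.
This simplifies notation, and the generalization to nonzero ground energy is straightforward.
To further simplify notation we define

\[ \hat{V} = \lambda V. \tag{4.58} \]

Suppose the ground space of $H^{(0)}$ is $d$-dimensional and denote it by $\mathcal{E}^{(0)}$. Let $|\psi_1\rangle, \ldots, |\psi_d\rangle$
be the perturbed eigenstates arising from the splitting of this degenerate ground space, and
let $E_1, \ldots, E_d$ be their energies. Furthermore, let $|\alpha_j\rangle = P_0 |\psi_j\rangle$ where $P_0$ is the projector
onto the unperturbed ground space of $H^{(0)}$. If $\lambda$ is sufficiently small, $|\alpha_1\rangle, \ldots, |\alpha_d\rangle$ are
linearly independent, and we can define an operator $\mathcal{U}$ such that

\[ \mathcal{U} |\alpha_j\rangle = |\psi_j\rangle \tag{4.59} \]

and

\[ \mathcal{U} |\phi\rangle = 0 \quad \forall\, |\phi\rangle \in \mathcal{E}^{(0)\perp}. \tag{4.60} \]

Now let $\mathcal{A}$ be the operator

\[ \mathcal{A} = P_0 \hat{V} \mathcal{U}. \tag{4.61} \]

$\mathcal{A}$ has $|\alpha_1\rangle, \ldots, |\alpha_d\rangle$ as its eigenstates, and $E_1, \ldots, E_d$ as its corresponding energies. To see
this, note that since $H^{(0)}$ has zero ground state energy

\[ P_0 \hat{V} = P_0 (H^{(0)} + \hat{V}) = P_0 H. \tag{4.62} \]

Thus,

\[ \mathcal{A} |\alpha_j\rangle = P_0 \hat{V} \mathcal{U} |\alpha_j\rangle = P_0 \hat{V} |\psi_j\rangle = P_0 H |\psi_j\rangle = P_0 E_j |\psi_j\rangle = E_j |\alpha_j\rangle. \tag{4.63} \]

The essential task in this formulation of degenerate perturbation theory is to find a perturbative
expansion for $\mathcal{U}$. From $\mathcal{U}$ one can obtain $\mathcal{A}$ by equation (4.61). By diagonalizing $\mathcal{A}$
one obtains $E_1, \ldots, E_d$, and $|\alpha_1\rangle, \ldots, |\alpha_d\rangle$. Then, by applying $\mathcal{U}$ to $|\alpha_j\rangle$ one obtains $|\psi_j\rangle$.
So, given a perturbative formula for $\mathcal{U}$, all quantities of interest can be calculated. Rather
than diagonalizing $\mathcal{A}$ to obtain individual eigenstates and eigenenergies, one can instead
compute an effective Hamiltonian for the entire perturbed eigenspace, defined by

\[ H_{\mathrm{eff}}(H, d) \equiv \sum_{j=1}^{d} E_j |\psi_j\rangle \langle \psi_j|. \tag{4.64} \]

This is given by

\[ H_{\mathrm{eff}}(H, d) = \mathcal{U} \mathcal{A} \mathcal{U}^\dagger. \tag{4.65} \]

To derive a perturbative formula for $\mathcal{U}$, we start with Schrödinger's equation:

\[ H |\psi_j\rangle = E_j |\psi_j\rangle. \tag{4.66} \]

By equation (4.62), left-multiplying this by $P_0$ yields

\[ P_0 \hat{V} |\psi_j\rangle = E_j |\alpha_j\rangle. \tag{4.67} \]

By equation (4.60),

\[ \mathcal{U} P_0 = \mathcal{U}. \tag{4.68} \]

Thus left-multiplying equation (4.67) by $\mathcal{U}$ yields

\[ \mathcal{U} \hat{V} |\psi_j\rangle = E_j |\psi_j\rangle. \tag{4.69} \]

By subtracting (4.69) from (4.66) we obtain

\[ (H - \mathcal{U} \hat{V}) |\psi_j\rangle = 0. \tag{4.70} \]

The span of $|\psi_j\rangle$ we call $\mathcal{E}$. For any state $|\beta\rangle$ in $\mathcal{E}$ we have

\[ (H - \mathcal{U} \hat{V}) |\beta\rangle = 0. \tag{4.71} \]

Since $\mathcal{U} |\gamma\rangle \in \mathcal{E}$ for any state $|\gamma\rangle$, it follows that

\[ (H - \mathcal{U} \hat{V}) \mathcal{U} = 0. \tag{4.72} \]

This equation can be rewritten as

\[ H^{(0)} \mathcal{U} = -\hat{V} \mathcal{U} + \mathcal{U} \hat{V} \mathcal{U}. \tag{4.73} \]

Defining $Q_0 = 1 - P_0$ we have

\[ \mathcal{U} = P_0 \mathcal{U} + Q_0 \mathcal{U}. \tag{4.74} \]

Substituting this into the left side of (4.73) yields

\[ H^{(0)} Q_0 \mathcal{U} = -\hat{V} \mathcal{U} + \mathcal{U} \hat{V} \mathcal{U}, \tag{4.75} \]

because $H^{(0)} P_0 = 0$. In $\mathcal{E}^{(0)\perp}$, $H^{(0)}$ has a well defined inverse and one can write

\[ Q_0 \mathcal{U} = -\frac{1}{H^{(0)}} Q_0 (\hat{V} \mathcal{U} - \mathcal{U} \hat{V} \mathcal{U}). \tag{4.76} \]

Using equation (4.74) one obtains

\[ \mathcal{U} = P_0 \mathcal{U} - \frac{1}{H^{(0)}} Q_0 (\hat{V} \mathcal{U} - \mathcal{U} \hat{V} \mathcal{U}). \tag{4.77} \]

By the definition of $\mathcal{U}$ it is apparent that $P_0 \mathcal{U} = P_0$, thus this equation simplifies to

\[ \mathcal{U} = P_0 - \frac{1}{H^{(0)}} Q_0 (\hat{V} \mathcal{U} - \mathcal{U} \hat{V} \mathcal{U}). \tag{4.78} \]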
Figure 4-3: From a given $m$-tuple $(l_1, l_2, \ldots, l_m)$ we construct a corresponding stairstep
diagram by making the $j$th step have height $l_j$, as illustrated above.

We now expand $\mathcal{U}$ in powers of $\lambda$ (equivalently, in powers of $\hat{V}$), and denote the $m$th
order term by $\mathcal{U}^{(m)}$. Substituting this expansion into equation (4.78) and equating terms at
each order yields the following recurrence relations.

\[ \mathcal{U}^{(0)} = P_0 \tag{4.79} \]
\[ \mathcal{U}^{(m)} = -\frac{Q_0}{H^{(0)}} \left[ \hat{V} \mathcal{U}^{(m-1)} - \sum_{p=1}^{m-1} \mathcal{U}^{(p)} \hat{V} \mathcal{U}^{(m-p-1)} \right] \quad (m = 1, 2, 3, \ldots) \tag{4.80} \]

Note that the sum over $p$ starts at $p = 1$, not $p = 0$. This is because

\[ Q_0 P_0 = 0. \tag{4.81} \]

Let

\[ S^l = \begin{cases} \dfrac{1}{\left(-H^{(0)}\right)^l} Q_0 & \text{if } l > 0 \\ -P_0 & \text{if } l = 0. \end{cases} \tag{4.82} \]

$\mathcal{U}^{(m)}$ is of the form

\[ \mathcal{U}^{(m)} = {\sum}' S^{l_1} \hat{V} S^{l_2} \hat{V} \cdots S^{l_m} \hat{V} P_0, \tag{4.83} \]

where ${\sum}'$ is a sum over some subset of $m$-tuples $(l_1, l_2, \ldots, l_m)$ such that

\[ l_i \geq 0 \quad (i = 1, 2, \ldots, m) \tag{4.84} \]
\[ l_1 + l_2 + \ldots + l_m = m. \tag{4.85} \]

The proof is an easy induction. $\mathcal{U}^{(0)}$ clearly satisfies this, and we can see that if $\mathcal{U}^{(j)}$ has
these properties for all $j < m$, then by recurrence (4.80), $\mathcal{U}^{(m)}$ also has these properties.

All that remains is to prove that the subset of allowed $m$-tuples appearing in the sum
${\sum}'$ are exactly those which satisfy

\[ l_1 + \ldots + l_p \geq p \quad (p = 1, 2, \ldots, m-1). \tag{4.86} \]

Following [28], we do this by introducing stairstep diagrams to represent the $m$-tuples,
as shown in figure 4-3. The $m$-tuples with property (4.86) correspond to diagrams in which
the steps lie above the diagonal. Following [28] we call these convex diagrams. Thus our
task is to prove that the sum is over all and only the convex diagrams. To do this, we
consider the ways in which convex diagrams of order $m$ can be constructed from convex
diagrams of lower order. We then relate this to the way $\mathcal{U}^{(m)}$ is obtained from lower order
terms in the recurrence (4.80).

Figure 4-4: A convex diagram must have either $l_1 = 1$ or $l_1 > 1$. In either case, the diagram
can be decomposed as a concatenation of lower order convex diagrams.

In any convex diagram, $l_1 \geq 1$. We now consider the two cases $l_1 = 1$ and $l_1 > 1$. In the
case that $l_1 = 1$, the diagram is as shown on the left in figure 4-4. In any convex diagram
of order $m$ with $l_1 = 1$, there is an intersection with the diagonal after one step, at the
point that we have labelled $c$. The diagram from $c$ to $b$ is a convex diagram of order $m - 1$.
Conversely, given any convex diagram of order $m - 1$ we can construct a convex diagram of
order $m$ by adding one step to the beginning. Thus the convex diagrams of order $m$ with
$l_1 = 1$ correspond bijectively to the convex diagrams of order $m - 1$.

The case $l_1 > 1$ is shown in figure 4-4 on the right. Here we introduce the line from $a'$
to $b'$, which is parallel to the diagonal, but higher by one step. Since the diagram must end
at $b$, it must cross back under $a'b'$ at some point. We'll label the first point at which it does
so as $c'$. In general, $c'$ can equal $b'$. The curve going from $a'$ to $c'$ is a convex diagram of
order $p$ with $1 \leq p \leq m - 1$, and the curve going from $c'$ to $b$ is a convex diagram of order
$m - p - 1$ (which may be order zero if $c' = b'$). Since $c'$ exists and is unique, this establishes
a bijection between the convex diagrams of order $m$ with $l_1 > 1$, and the set of the pairs of
convex diagrams of orders $p$ and $m - p - 1$, for $1 \leq p \leq m - 1$.

Examining the recurrence (4.80) we see that the $l_1 = 1$ diagrams are exactly those which
arise from the term

\[ -\frac{Q_0}{H^{(0)}} \hat{V} \mathcal{U}^{(m-1)} \tag{4.87} \]

and the $l_1 > 1$ diagrams are exactly those which arise from the term

\[ \frac{Q_0}{H^{(0)}} \sum_{p=1}^{m-1} \mathcal{U}^{(p)} \hat{V} \mathcal{U}^{(m-p-1)}, \tag{4.88} \]

which completes the proof that ${\sum}'$ is over the $m$-tuples satisfying equation (4.86).
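The bijection argument can be checked mechanically for small orders. The Python sketch below tracks only the index tuples produced by the recurrence (4.80), using the identities $-Q_0/H^{(0)} = S^1$, $S^1 S^l = S^{l+1}$ for $l \geq 1$, and $-P_0 = S^0$, and compares them with a brute-force list of the tuples satisfying (4.84) through (4.86). The tuple representation and function names are illustrative assumptions.

```python
from itertools import product

def recurrence_tuples(max_order):
    """Index tuples generated by recurrence (4.80), order by order.

    A term S^{l_1} V S^{l_2} V ... S^{l_m} V P_0 is stored as (l_1, ..., l_m);
    the zeroth order term P_0 is the empty tuple.
    """
    terms = {0: [()]}
    for m in range(1, max_order + 1):
        # Term (4.87): S^1 V U^(m-1) prepends a step of height 1.
        new = [(1,) + t for t in terms[m - 1]]
        # Term (4.88): S^1 U^(p) (-P_0) V U^(m-p-1) raises l_1 by one
        # and inserts an S^0 between the two lower order diagrams.
        for p in range(1, m):
            for a in terms[p]:
                for b in terms[m - p - 1]:
                    new.append((a[0] + 1,) + a[1:] + (0,) + b)
        terms[m] = new
    return terms

def is_convex(t):
    """Check l_i >= 0, partial sums l_1 + ... + l_p >= p, and total sum m."""
    partial = 0
    for p, l in enumerate(t, start=1):
        partial += l
        if l < 0 or partial < p:
            return False
    return partial == len(t)

def brute_force_convex(m):
    """All convex m-tuples, found by exhaustive search over {0, ..., m}^m."""
    return {t for t in product(range(m + 1), repeat=m) if is_convex(t)}
```

For each order checked, the recurrence produces every convex tuple exactly once, in agreement with the statement that the sum runs over all and only the convex diagrams.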

4.6 Convergence of Perturbation Series 

Here we show that the perturbative expansion for $\mathcal{U}$ given in equation (4.27) converges for

\[ \|\lambda V\| < \frac{\gamma}{4}. \tag{4.89} \]

By equation (4.20) the convergence of $\mathcal{U}$ also implies the convergence of $\mathcal{A}$. Applying the
triangle inequality to equation (4.27) yields

\[ \|\mathcal{U}\| \leq 1 + \sum_{m=1}^{\infty} \|\mathcal{U}^{(m)}\|. \tag{4.90} \]

Substituting in equation (4.28) and applying the triangle inequality again yields

\[ \|\mathcal{U}\| \leq 1 + \sum_{m=1}^{\infty} \lambda^m \sum_{(m)} \|S^{l_1} V S^{l_2} V \cdots S^{l_m} V P_0\|. \tag{4.91} \]

By the submultiplicative property of the operator norm,

\[ \|\mathcal{U}\| \leq 1 + \sum_{m=1}^{\infty} \lambda^m \sum_{(m)} \|S^{l_1}\| \cdot \|V\| \cdots \|S^{l_m}\| \cdot \|V\| \cdot \|P_0\|. \tag{4.92} \]

$\|P_0\| = 1$, and by equation (4.22) we have

\[ \|S^l\| = \frac{1}{\left(E_1^{(0)}\right)^l} = \frac{1}{\gamma^l}. \tag{4.93} \]

Since the sum in equation (4.92) is over $l_1 + \ldots + l_m = m$, we have

\[ \|\mathcal{U}\| \leq 1 + \sum_{m=1}^{\infty} \sum_{(m)} \frac{\|\lambda V\|^m}{\gamma^m}. \tag{4.94} \]

The sum $\sum_{(m)}$ is over a subset of the $m$-tuples adding up to $m$. Thus, the number of terms
in this sum is at most the number of ways of obtaining $m$ as a sum of $m$ nonnegative
integers. By elementary combinatorics, the number of ways to obtain $n$ as a sum of $r$
nonnegative integers is $\binom{n+r-1}{n}$, thus

\[ \|\mathcal{U}\| \leq 1 + \sum_{m=1}^{\infty} \binom{2m-1}{m} \frac{\|\lambda V\|^m}{\gamma^m}. \tag{4.95} \]

Since

\[ \sum_{j=0}^{2m-1} \binom{2m-1}{j} = 2^{2m-1}, \tag{4.96} \]

we have

\[ \binom{2m-1}{m} \leq 2^{2m-1}. \tag{4.97} \]

Substituting this into equation (4.95) converts it into a convenient geometric series:

\[ \|\mathcal{U}\| \leq 1 + \sum_{m=1}^{\infty} 2^{2m-1} \left( \frac{\|\lambda V\|}{\gamma} \right)^m. \tag{4.98} \]

This series converges for

\[ \frac{4 \|\lambda V\|}{\gamma} < 1. \]
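The chain of bounds used above can be illustrated numerically. In the Python sketch below, the exact number of allowed $m$-tuples is counted by dynamic programming and compared with the binomial bound of (4.95) and the power-of-two bound of (4.97), and a partial sum of the geometric series (4.98) is evaluated for a sample value of $x = \|\lambda V\| / \gamma$. The function names and the sample value are illustrative assumptions.

```python
from math import comb

def count_convex(m):
    """Number of m-tuples of nonnegative integers with total m and
    partial sums l_1 + ... + l_p >= p for every p."""
    ways = {0: 1}  # maps a partial sum to the number of prefixes reaching it
    for p in range(1, m + 1):
        nxt = {}
        for total, w in ways.items():
            # The new partial sum cannot decrease and must be at least p.
            for new_total in range(max(total, p), m + 1):
                nxt[new_total] = nxt.get(new_total, 0) + w
        ways = nxt
    return ways.get(m, 0)

def bound_partial_sum(x, terms):
    """Partial sum of 1 + sum_m 2^(2m-1) x^m from (4.98), with x = ||lambda V|| / gamma."""
    return 1 + sum(2 ** (2 * m - 1) * x ** m for m in range(1, terms + 1))
```

For $x < 1/4$ the geometric series sums to $1 + 2x/(1 - 4x)$, so for example `bound_partial_sum(0.2, 200)` is very close to 3, while for $x \geq 1/4$ the partial sums grow without bound.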
Chapter 5

Multiplicity and Unity 



The fox knows many things but the hedgehog knows one big thing. 
-Archilochus 

In his essay "The Hedgehog and the Fox," Isaiah Berlin, echoing Archilochus, classified
thinkers into two categories, hedgehogs, who look for unity and generality, and foxes, who 
revel in the variety and complexity of the universe. In this chapter I will revisit the content 
of my thesis from each of these points of view. 

5.1 Multiplicity 

...you shall not add to the misery and sorrow of the world, but shall smile to the infinite 
variety and mystery of it. 
-Sir William Saroyan 

In this thesis I have spoken about many models of computation, especially the adiabatic, 
topological, and circuit models. Each of these models of computation can solve exactly the 
same set of problems in polynomial time. Thus one might ask: "why bother"? Why not 
just stick to the quantum circuit model? 

As the many examples in this thesis have shown, sticking to only one model of quantum 
computation would be a mistake. The most obvious justification for considering alternative 
models of quantum computation is that they provide promising alternatives for the physical 
implementation of quantum computers. As discussed in chapters 1 and 2, the main barrier
to practical quantum computation is the effect of noise. Three types of quantum computer
show promise for overcoming this barrier: quantum circuits, by means of active error correction,
adiabatic quantum computers, by their inherent indifference to local properties,
and topological quantum computers by their energy gap, as discussed in chapter 2. It is
not yet clear which of these strategies will prove most useful, and this provides justification 
for investigating all of them. 

A skeptic might protest that this only justifies investigation of physically realistic models 
of quantum computation. Some of the models discussed in the literature, and in this thesis, 
are not very physically realistic. For example, it seems unlikely that adiabatic computation 
with 4-local Hamiltonians, or quantum walks on exponentially many nodes will be imple- 
mented in laboratories. However, even models slightly removed from physical practicality 
have proven useful in the development of practical physical implementations. For example, 
we now know that adiabatic quantum computation with 2-local Hamiltonians is universal. 
This was originally proven by showing that 5-local Hamiltonians are computationally universal,
and then making a series of reductions from 3-local down to 2-local. Even now, the
best known direct universality proof is for 3-local adiabatic quantum computation [133].

There is a second, less obvious justification for considering multiple models of quantum
computation. Although all quantum algorithms could in principle be formulated
using the quantum circuit model, this is not how quantum computation actually developed.
The quantum algorithms for factoring and searching were discovered using the quantum 
circuit model. The quantum speedups for NAND tree evaluation and simulated annealing 
were discovered using quantum walks. The quantum algorithm for approximating Jones 
polynomials was discovered using topological quantum computation. History has shown us 
that a particular model of quantum computation gives us a perspective from which certain 
quantum algorithms are easily visible, while they remain obscure from the perspective of 
other models. 

Having argued the virtues of having a multiplicity of models of quantum computation, 
the time has come to put this principle into action. That is, I will propose a direction 
for further research based on the premise that formulating additional models of quantum 
computation is useful even if the models are not directly practical. 

One striking thing about the topological model of quantum computation is its indiffer- 
ence to the details of the manipulations of the particles. Only the topology of the braiding 
matters, and the specific geometry is irrelevant. Let's now push this a step further and 
consider a model where even the topology is irrelevant, and the only thing that matters is 
how the particles were permuted. Just as the braiding of anyons induces a representation 
of the braid group, we analogously expect that the permutation of the particles induces a 
representation of the symmetric group. 

Such a model is not entirely without physical motivation. The exchange statistics of 
Bosons and Fermions are exactly the two one-dimensional representations of the symmetric 
group. Particles with exchange statistics given by a higher dimensional representation 
of the symmetric group have been proposed in the past. This is called parastatistics. 
Such particles have never been observed. However, based on the presently understood 
physics, they remain an exotic, but not inconceivable, possibility [142]. Furthermore, many 
Hamiltonians in nature are symmetric under permutation of the particles. If a Hamiltonian 
has symmetry group G, then its degenerate eigenspaces will transform as representations 
of G, and these representations will generically be irreducible [91]. 

More precisely, suppose a Hamiltonian H on n particles has a d-fold degenerate eigenspace 
spanned by $\phi_1(x_1, \ldots, x_n), \ldots, \phi_d(x_1, \ldots, x_n)$. We start with a given state within that space 

$$\psi(x_1, \ldots, x_n) = \sum_{j=1}^{d} \alpha_j \phi_j(x_1, \ldots, x_n).$$

Then, if we permute the particles according to some permutation $\pi$ we obtain some other 
state $\psi(x_{\pi(1)}, \ldots, x_{\pi(n)})$. Because the Hamiltonian is permutation symmetric, this state will 
lie within the same eigenspace. That is, there are some coefficients $\beta_1, \ldots, \beta_d$ such that 

$$\psi(x_{\pi(1)}, \ldots, x_{\pi(n)}) = \sum_{j=1}^{d} \beta_j \phi_j(x_1, \ldots, x_n).$$

It is clear that the dependence of $\beta_1, \ldots, \beta_d$ on $\alpha_1, \ldots, \alpha_d$ is linear. Thus there is some 
matrix $M^\pi$ corresponding to permutation $\pi$: 

$$\beta_j = \sum_{i=1}^{d} M^\pi_{ji} \alpha_i.$$

It is not hard to see that the mapping from permutations to their corresponding matrices is a 
group homomorphism. That is, these $d \times d$ matrices form a representation of the symmetric 
group $S_n$. If H has no other symmetries, then this representation will generically be 
irreducible [91]. 

As advocated above, I will not worry too much about the physical justification for the 
model, but instead consider just two questions. First, does the model lend itself to the 
rapid solution of any interesting computational problems, and second, can it be efficiently 
simulated by quantum circuits? If both answers are yes, then we obtain new quantum 
algorithms. 

The question of what problem this model of computation lends itself to has an obvious 
answer: the computation of representations of the symmetric group. There are many good 
reasons to restrict our consideration to only the irreducible representations. Any finite group 
has only finitely many irreducible representations but infinitely many reducible represen- 
tations. These reducible representations are always the direct sum of multiple irreducible 
representations. Thus, by performing a computation with a reducible representation we 
would merely be performing a superposition of computations with irreducible representa- 
tions. 

In chapter [3] we saw that if we can apply a representation of the braid group then, 
using the Hadamard test, we can estimate the matrix elements of this representation to 
polynomial precision. Furthermore, by sampling over the diagonal matrix elements we can 
estimate the normalized trace of the representation to polynomial precision. If we instead 
have a representation of the symmetric group, the situation is precisely analogous. The 
trace of a group representation is called its character. Characters of group representations 
have many uses not only in mathematics, but also in physics and chemistry. Note that, 
unlike the matrix elements of a representation, its character is basis independent. 
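
The sampling idea in the preceding paragraph can be illustrated with a small classical simulation. This is only a sketch under stated assumptions: it computes the Hadamard-test outcome probability (1 + Re⟨ψ|U|ψ⟩)/2 directly from a small matrix rather than running a quantum circuit, and the function names and the example matrix `S2` are illustrative, not taken from the text. `S2` is a 2 x 2 orthogonal reflection of the kind by which a transposition acts in the two-dimensional irreducible representation of S_3 (up to basis ordering and sign conventions), so its normalized trace is 0.

```python
import random
from math import sqrt

def hadamard_test_real(u, psi, shots, rng):
    """Estimate Re<psi|U|psi> from simulated Hadamard-test outcomes."""
    n = len(psi)
    u_psi = [sum(u[i][j] * psi[j] for j in range(n)) for i in range(n)]
    overlap = sum(complex(psi[i]).conjugate() * u_psi[i] for i in range(n))
    p_zero = (1 + overlap.real) / 2      # probability that the control qubit reads 0
    zeros = sum(1 for _ in range(shots) if rng.random() < p_zero)
    return 2 * zeros / shots - 1         # (#0 - #1) / shots

def normalized_trace_estimate(u, shots, rng):
    """Estimate Re tr(U)/dim by running the test on uniformly random basis states."""
    n = len(u)
    total = 0
    for _ in range(shots):
        k = rng.randrange(n)             # random basis state |k>
        p_zero = (1 + complex(u[k][k]).real) / 2
        total += 1 if rng.random() < p_zero else -1
    return total / shots

# Illustrative example (assumption): a reflection with trace 0.
S2 = [[-0.5, sqrt(3) / 2], [sqrt(3) / 2, 0.5]]
rng = random.Random(0)
print(hadamard_test_real(S2, [1, 0], 20000, rng))   # should be close to -0.5
print(normalized_trace_estimate(S2, 20000, rng))    # should be close to 0
```

The statistical error of both estimates shrinks like the inverse square root of the number of shots, which is why only polynomial precision is obtained in polynomial time.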

Just as with anyonic quantum computation, we can imagine the particles initially po- 
sitioned along a line. We then allow ourselves to swap neighbors. The runtime of an 
algorithm is the number of necessary swaps. Interestingly, in the parastatistical model of 
computation, no algorithm has runtime more than $O(n^2)$, because there are only finitely 
many permutations and each of them can be constructed using at most $O(n^2)$ swaps. 
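
The quadratic bound on the number of neighbor swaps can be checked with a short sketch. It uses bubble sort as one convenient way to decompose a permutation into adjacent swaps; the function name `adjacent_swaps` is an illustrative choice. Each swap removes exactly one inversion, so at most n(n-1)/2 swaps are ever needed.

```python
def adjacent_swaps(perm):
    """Return a list of neighbor swaps (i, i+1) that sorts `perm` (bubble sort)."""
    p = list(perm)
    swaps = []
    n = len(p)
    for end in range(n - 1, 0, -1):
        for i in range(end):
            if p[i] > p[i + 1]:
                # Swapping an out-of-order neighboring pair removes one inversion.
                p[i], p[i + 1] = p[i + 1], p[i]
                swaps.append((i, i + 1))
    return swaps

n = 5
worst = adjacent_swaps(range(n - 1, -1, -1))   # the reversal has the most inversions
print(len(worst), n * (n - 1) // 2)
```

Running the swaps in reverse order builds the permutation from the identity, so the same count bounds the cost of constructing any permutation.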

To formulate the concrete computational problems we'll need to delve a little bit into 
the specifics of the irreducible representations of the symmetric group. The irreducible 
representations of $S_n$ are indexed by the Young diagrams of n boxes. These are all the 
possible partitions of the n boxes into rows, where the rows are arranged in descending 
order of length. The example n = 4 is shown in figure 5-1. The matrix elements of these 
representations depend on a choice of basis. For our purposes it is essential that the basis 
be chosen so that the representation is unitary. The most widely used such basis is called 
the Young-Yamanouchi basis [91]. In this basis the irreducible representations are sparse 
orthogonal matrices. For the irreducible representation corresponding to a Young diagram 
$\lambda$, the Young-Yamanouchi basis vectors correspond to the set of standard Young tableaux 
compatible with $\lambda$. These are all the numberings of boxes so that if we added the 
boxes in this order, the configuration would be a valid Young diagram after every step. This 
is illustrated in figure 5-2. It is not hard to see that for some $\lambda$, the number of standard 
tableaux, and hence the dimension of the representation, is exponential in n. 

Figure 5-1: The Young diagrams with four boxes. They correspond to the irreducible 
representations of $S_4$. 

Figure 5-2: Above we show an example Young tableau, and beneath it the corresponding 
sequence of Young diagrams from left to right. 
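
The growth of these dimensions can be made concrete with the hook length formula, a standard result that is not derived in the text: the number of standard tableaux of a diagram with n boxes is n! divided by the product of the hook lengths of its boxes. The sketch below is a minimal illustration; the function name and the choice of two-row test shapes are assumptions made here.

```python
from math import factorial

def num_standard_tableaux(shape):
    """Count standard Young tableaux of `shape` (row lengths in weakly
    decreasing order) using the hook length formula."""
    n = sum(shape)
    # Height of each column = number of rows that reach that column.
    cols = [sum(1 for r in shape if r > c) for c in range(shape[0])] if shape else []
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1        # boxes to the right in the same row
            leg = cols[j] - i - 1    # boxes below in the same column
            hooks *= arm + leg + 1
    return factorial(n) // hooks

# The five diagrams with four boxes; squared dimensions sum to 4! = 24.
s4 = [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]
print([num_standard_tableaux(s) for s in s4])
# Two equal rows of length m: the count grows exponentially with n = 2m.
print([num_standard_tableaux((m, m)) for m in range(1, 11)])
```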

Thus we can state the following computational problems regarding $S_n$, which are solvable 
in poly(n) time in the parastatistical model of computation. 

Problem: Calculate an irreducible representation for the symmetric group $S_n$. 
Input: A Young diagram specifying the irreducible representation, a permutation from 
$S_n$, a pair of standard Young tableaux indicating the desired matrix element, and a 
polynomially small parameter $\epsilon$. 
Output: The specified matrix element to within $\pm\epsilon$. 

Problem: Calculate a character for the symmetric group $S_n$. 
Input: A Young diagram $\lambda$ specifying the irreducible representation, a permutation $\pi$ 
from $S_n$, and a polynomially small parameter $\epsilon$. 
Output: Let $\chi_\lambda(\pi)$ be the character, and let $d_\lambda$ be the dimension of the irreducible 
representation. The output is $\chi_\lambda(\pi)/d_\lambda$ to within $\pm\epsilon$. 

Next, we must find out how hard these problems are classically. If classical polynomial 
time algorithms for these problems are known then we have not discovered anything in- 
teresting. Without looking into the classical computation literature there are already two 
things we can say. First, as noted above, for some Young diagrams of $S_n$, the corresponding 
representation has dimension exponential in n. Thus the above problems cannot be 
solved classically in polynomial time by directly using matrix multiplication. Second, the 
irreducible representations have both positive and negative matrix elements in all of the 
bases discussed in standard references [91] [29]. Thus interference effects are important, so 
naive Markov chain methods will not work. 

To go beyond these simple observations, we must consult the relevant experts and body 
of literature. As shown in [95], the problem of exactly evaluating the characters of the 
irreducible representations of the symmetric group is #P-complete. This rules out the 
possibility that some ingenious closed form expression for the characters is buried in the math 
literature somewhere. The characters are obtained in the parastatistical model of computation 
to only polynomial precision. This corresponds to the precision one could obtain 
by classically sampling from the diagonal matrix elements of the representation. Thus, the 
most likely scenario by which these algorithms could fail to be interesting is if the individual 
matrix elements of the irreducible representations of the symmetric group are easy to 
compute classically, and the only reason that computing the characters is hard is that there 
are exponentially many of these matrix elements to add up. However, the sources I have 
consulted do not provide any way to obtain matrix elements of the irreducible representations 
of $S_n$ in poly(n) time for arbitrary permutations [129] [91] [29]. Furthermore, there 
is a body of work on how to improve the efficiency of exponential time algorithms for this 
problem [175] [176] [M ] [59]. Unless this entire body of work is misguided, no polynomial-time 
methods for computing such matrix elements were known when these papers were written. 

A completeness result would provide even stronger evidence of the difficulty of these 
problems. The problems of estimating Jones polynomials discussed in section [3] were each 
BQP or DQC1 complete. In general one could conjecture that for any representation dense 
in a unitary group of exponentially large dimension, the problem of estimating matrix 
elements to polynomial precision is BQP-complete and the problem of estimating normalized 
characters to polynomial precision is DQC1-complete. However, because the symmetric 
group is finite, no representation of it can be dense in a continuous group. Thus it seems 
unlikely that the problems of estimating the matrix elements and characters of the symmetric 
group are BQP-complete or DQC1-complete. Furthermore, the fact that no parastatistical 
algorithm on n particles requires more than $O(n^2)$ computational steps makes it seem 
unlikely^1 that this model is universal. 

Next we must see whether the parastatistical model of computation can be simulated in 
polynomial time by standard quantum computers. If so we obtain two new polynomial time 
quantum algorithms apparently providing exponential speedup over known classical algo- 
rithms. Normally the search for quantum algorithms is a pursuit fraught with frustration. 
However, in this case we have a win-win situation. If the parastatistical model cannot be 
simulated by quantum computers in polynomial time, then instead of quantum algorithms 
we have a physically plausible model of computation not contained in BQP, which is also 
very exciting. 

A detailed examination of the Young-Yamanouchi matrices in [91] makes me fairly convinced 
that the parastatistical model of computation can be simulated in polynomial time 
by quantum circuits. Specifically, it appears that this can be done very analogously to the 
implementation of the Fibonacci representation of the braid group in chapter 3. However, 
this is not the place to discuss such details. Instead I will now switch teams, and take the 
side of the hedgehog. 



^1 Probably one could prove a precise no-go theorem along these lines using the hierarchy theorem for 
BQP. (See chapter [T]) 

5.2 Unity 



. . . the world will somehow come clearer and we will grasp the true strangeness of the 
universe. And the strangeness will all prove to be connected and make sense. 
-E.O. Wilson 

Upon examining the catalogue of quantum algorithms in chapter [H a striking pattern 
emerges. Although the quantum algorithms are superficially widely varied, the exponential 
speedups for non-oracular problems generally fall into two broad families: the speedups 
obtained by reduction to hidden subgroup problems, and the speedups related to knot 
invariants. Interestingly, both of these families of speedups rely on representation theory. 
The speedups for the hidden subgroup related problems are based on the Fourier transform 
over groups. Such a Fourier transform goes between the computational basis, and a basis 
made from matrix elements of irreducible representations. The speedups for the evaluation 
of knot invariants are based on implementing a unitary representation of the braid group 
using quantum circuits. It is therefore tempting to look for some grand unification to unite 
all exponential speedups into one representation-theoretic framework. 

I do not know whether such a grand unification is possible, never mind how to carry it 
out. However, I will offer a bold speculation as to a possible route forward. Rather than 
starting with the Fourier transform, let's first consider the Schur transform. Like the Fourier 
transform, the Schur transform can be efficiently implemented using quantum circuits [92], and 
has applications in quantum information processing. Furthermore, it has a more obvious 
connection to multiparticle physics than does the Fourier transform. For example, suppose 
we have two spin-1/2 particles. The Hilbert space of quantum states for these spins is four 
dimensional. One basis we can choose for this Hilbert space is obtained by taking the tensor 
product of the $\sigma_z$-basis for each spin: 

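$$
\begin{array}{l}
|\uparrow\rangle |\uparrow\rangle \\
|\uparrow\rangle |\downarrow\rangle \\
|\downarrow\rangle |\uparrow\rangle \\
|\downarrow\rangle |\downarrow\rangle
\end{array}
$$

Another basis can be obtained as the simultaneous eigenbasis of the total angular momentum 
$\sigma_x^{(1)}\sigma_x^{(2)} + \sigma_y^{(1)}\sigma_y^{(2)} + \sigma_z^{(1)}\sigma_z^{(2)}$ and the total azimuthal angular momentum $\sigma_z^{(1)} + \sigma_z^{(2)}$. 
This basis is 

$$
\begin{array}{l}
|\uparrow\rangle |\uparrow\rangle \\
\frac{1}{\sqrt{2}} \left( |\uparrow\rangle |\downarrow\rangle + |\downarrow\rangle |\uparrow\rangle \right) \\
|\downarrow\rangle |\downarrow\rangle \\
\frac{1}{\sqrt{2}} \left( |\uparrow\rangle |\downarrow\rangle - |\downarrow\rangle |\uparrow\rangle \right)
\end{array}
$$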

The Schur transform in this case is just the unitary change of basis between these two bases. 
The general case is explained in [22], and is fairly analogous. 
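
For the two-spin case, the change of basis can be written down and checked numerically. The following is a minimal sketch: the matrix `schur` lists the three symmetric states and the antisymmetric state as rows in the product basis (up-up, up-down, down-up, down-down), and the variable and helper names are illustrative. As a check it uses the standard identity that the operator exchanging the two spins equals (1 + σ^(1)·σ^(2))/2, so the new basis must diagonalize the exchange operator with eigenvalues +1, +1, +1, -1.

```python
from math import sqrt

r = 1 / sqrt(2)

# Rows: up-up, (up-down + down-up)/sqrt(2), down-down, (up-down - down-up)/sqrt(2),
# expressed in the product basis up-up, up-down, down-up, down-down.
schur = [
    [1, 0, 0, 0],
    [0, r, r, 0],
    [0, 0, 0, 1],
    [0, r, -r, 0],
]

# Operator that exchanges the two spins, in the product basis.
swap = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

gram = matmul(schur, transpose(schur))                          # identity if orthonormal
swap_in_new_basis = matmul(matmul(schur, swap), transpose(schur))

for row in swap_in_new_basis:
    print([round(v, 12) for v in row])
```

The exchange operator comes out diagonal, which is the two-particle instance of the new basis separating the symmetric and antisymmetric sectors.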

The matrix elements appearing in this change of basis are known as Clebsch-Gordan 
coefficients, and are tabulated in most undergraduate quantum mechanics textbooks. According 
to [136], the coefficients appearing in the fusion rules of a topological quantum 
field theory are essentially a generalization of Clebsch-Gordan coefficients. Thus, I offer the 
conjecture that one could implement the Schur transform directly using the fusion rules 
of some TQFT. (Since some TQFTs are universal for quantum computation, and Schur 
transforms can be efficiently computed, one could always take the circuit for finding Schur 
transforms and convert it into some extremely complicated braiding of anyons. This is not 
what we are looking for.) 

The Schur transform is related to the Fourier transform over the symmetric group [92]. 
Thus, if a direct TQFT implementation of the Schur transform were found then one could 
next look for direct anyonic implementations of Fourier transforms. I therefore propose the 
conjecture that the two classes of quantum algorithms correspond to the two components 
of a topological quantum field theory: the knot invariant algorithms correspond to the 
braiding rules, and the hidden subgroup problems correspond to the fusion rules. 

Appendix A 

Classical Circuit Universality 



Consider an arbitrary function f from n bits to one bit. f can be specified by listing 
each bitstring $x \in \{0, 1\}^n$ such that f(x) = 1. To show the universality of the 
{AND, NOT, OR, FANOUT} gate set, we wish to construct a circuit out of these gates 
that implements f. We'll diagrammatically represent these gates using 



AND OR NOT FANOUT 




From two-input AND gates we can construct an m-input AND gate for any m. The example 
m = 4 is shown below. 




An m-input OR gate can be constructed from 2-input OR gates, and an m-output FANOUT 
gate can be constructed from 2-output FANOUT gates similarly. The m-input AND gate 
accepts only the all-ones string 111...1. To accept a different string, one can simply attach NOT 
gates to all the inputs which should be 0. Using AND and NOT gates, one can thus make 
a circuit to accept each bitstring accepted by f. These can then be joined together using 
a multi-input OR circuit. FANOUT gates are used to supply the inputs to each of these. 
The resulting circuit simulates f, as shown in figure A-1. 
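
The construction just described can be sketched in a few lines. This is an illustration rather than a gate-level netlist: Python's `all` and `any` stand in for the multi-input AND and OR gates, reusing the input list for every term plays the role of FANOUT, and the function name `build_circuit` is an assumption made here. The accepted strings are those of the example in figure A-1.

```python
from itertools import product

def build_circuit(accepted):
    """Return a function that ORs together one AND-of-literals term per accepted bitstring."""
    def circuit(bits):
        terms = []
        for pattern in accepted:
            # NOT gates on the inputs that should be 0, then a multi-input AND.
            literals = [bool(b) if want == "1" else not b
                        for b, want in zip(bits, pattern)]
            terms.append(all(literals))
        return any(terms)            # multi-input OR over all terms
    return circuit

f = build_circuit(["0110", "1011", "1111"])
for bits in product([0, 1], repeat=4):
    if f(bits):
        print("".join(map(str, bits)))
```

The circuit has one AND term per accepted string, so its size is proportional to n times the number of accepted strings, which can be exponential in n.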

To implement a function with multiple bits of output, one can consider each bit of 
output to be a separate single-bit Boolean function of the inputs, and construct a circuit 
for each bit of output accordingly. 

Figure A-1: The Boolean circuit with four bits of input and one bit of output which accepts 
only the inputs 0110, 1011, and 1111 is implemented by the shown circuit. Any Boolean 
function can be implemented similarly, as described in the text. 

Appendix B 

Optical Computing 



Quantum mechanics bears a close resemblance to classical optics. An optical wave associates 
an amplitude with each point in space. A quantum wavefunction associates an amplitude 
with each possible configuration of the system. Whereas an optical wave is a function of at 
most three spatial variables, a quantum wavefunction of a system of particles is a function 
of all the parameters needed to describe the configuration of the particles. Thus classical 
optics lacks the exponentially high-dimensional state space of quantum mechanics. The 
similarity between classical optics and quantum mechanics is of course no coincidence, as 
photons are governed by quantum mechanics. Essentially, classical optics treats the case 
where these photons are independent and unentangled, so that the intensity of light at 
a given point on the detector is simply proportional to the probability density for a single 
photon to be detected at that point if it were sent through the optical apparatus by itself. 

Fourier transforms are an important primitive in quantum computing, and lie at the 
heart of the factoring algorithm, and several other quantum algorithms. As shown in [160], 
quantum computers can perform Fourier transforms on n qubits using $O(n^2)$ gates.^1 
Consider the computational basis states of n qubits as corresponding to the numbers 
$\{0, 1, \ldots, 2^n - 1\}$ via place value. The quantum Fourier transform on n qubits is the following 
unitary transformation. 

$$U_F \sum_{x=0}^{2^n-1} a(x) |x\rangle = \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} \sum_{x=0}^{2^n-1} e^{2\pi i x k/2^n} a(x) |k\rangle.$$

The Fourier transform can be performed optically with a single lens, as shown in figure B-1. 
Before digital computers became as powerful as they are today, people used to develop 
optical schemes for analog computation. Many of them were based in some way on the 
optical Fourier transform. 

^1 This has subsequently been improved. As shown in [51], approximate Fourier transforms can be 
performed on n qubits using O(n) gates. 

Figure B-1: A collimated beam shined at an angle toward a lens will have phases varying 
linearly across the face of the lens, due to the extra distance that some rays must travel. 
Such a beam gets focused to a point on the focal plane whose location depends on the 
angle at which the beam was shined onto the lens. Thus, the amplitude across the lens of 
$e^{ikx}$ gets transformed to an amplitude across the focal plane of $\delta(x + k)$. Since a lens is a 
linear optical device, any superposition of plane waves will be mapped to the corresponding 
superposition of delta functions. Thus the amplitude across the focal plane will be the 
Fourier transform of the amplitude across the face of the lens. 

Another important primitive in quantum algorithms is phase kickback. Typically, an 
oracle $U_f$ for a given function f acts according to 

$$U_f |x\rangle |y\rangle = |x\rangle |(y + f(x)) \bmod 2^{n_o}\rangle,$$

where $n_o$ is the number of output bits. Normally, one chooses y = 0 so that the output 
register contains f(x) after applying the oracle. However, if one instead prepares the output 
register in the state 

$$|\psi\rangle \propto \sum_{y=0}^{2^{n_o}-1} e^{-i 2\pi y/2^{n_o}} |y\rangle,$$

f(x) will be written into the phase instead of the qubits: 

$$U_f |x\rangle |\psi\rangle = e^{i 2\pi f(x)/2^{n_o}} |x\rangle |\psi\rangle.$$

This is because, for the process of adding z modulo $2^{n_o}$, $|\psi\rangle$ is an eigenstate with eigenvalue 
$e^{i 2\pi z/2^{n_o}}$. 
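
The eigenstate property behind phase kickback is easy to verify numerically. The sketch below is a classical check under the sign convention that the output register is prepared with amplitudes proportional to exp(-2πi y/N), N = 2^{n_o}; the function name `kickback_phase` is illustrative. It applies the map y -> (y + f) mod N to that state and returns the overlap with the original state, which equals the acquired phase exp(2πi f/N).

```python
import cmath

def kickback_phase(f_value, n_out):
    """Add f_value modulo 2**n_out to the phase-kickback state and return
    the overlap with the original state, i.e. the phase that was kicked back."""
    N = 2 ** n_out
    psi = [cmath.exp(-2j * cmath.pi * y / N) / N ** 0.5 for y in range(N)]
    shifted = [0j] * N
    for y, amp in enumerate(psi):
        shifted[(y + f_value) % N] += amp        # |y> -> |(y + f) mod N>
    return sum(a.conjugate() * b for a, b in zip(psi, shifted))

phase = kickback_phase(3, 3)
print(phase, cmath.exp(2j * cmath.pi * 3 / 8))   # the two values should agree
```

Because the overlap has modulus 1, the shifted state is the original state times a pure phase, which is exactly the statement that the output register is left unchanged while f is written into the phase.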

Phase kickback also has an optical analogue. Simply construct a sheet of glass whose 
thickness is proportional to f(x). Because light travels more slowly through glass than 
through air, the beam experiences a phase lag proportional to the thickness of glass it passes 
through. Now consider the following optical "algorithm". For simplicity we'll demonstrate 
it in two dimensions, although it could also be done in three. We are given a piece of glass 
whose thickness is given by a smooth function f(x). Because it is smooth, f(x) is locally 
linear, and so the sheet looks locally like a prism. By inserting this glass between two lenses, 
as shown in figure B-2, we can determine the angle of this prism (i.e. $df/dx$) by the location 
of the spot made on the detector. By the analogies discussed above, this has an alternative 
description in terms of Fourier transforms, and an analogous quantum algorithm, as shown 
in figure B-3. 

Figure B-2: At the left, a point source produces light. This is focused into a collimated 
beam by a lens. The beam then passes through a glass sheet which deflects the beam. The 
second lens then focuses the beam to a point on the detector whose location depends on 
the thickness gradient of the glass sheet. 

Figure B-3: Initially one starts with a given basis state. This is then Fourier transformed to 
yield the uniform superposition. A phase shift proportional to f(x) is then performed. If 
f(x) is locally linear, then the resulting shifted state is a plane wave $e^{i f'(x) x}$. The second 
Fourier transform then converts this to a basis state $|f'(x)\rangle$. 

So far, this is an unimpressive accomplishment. Both determining the angle of a prism 
and approximating the derivative of an oracular function of one variable are easy tasks. 
Now recall the difference between optical and quantum computing. Namely, the quantum 
analogue can be extended to an arbitrary number of dimensions. In particular, a phase 
proportional to f(x) can easily be obtained using phase kickback, and a d-dimensional 
quantum Fourier transform can be achieved by applying a quantum Fourier transform to 
each vector component. As a result, on a quantum computer, one can obtain the gradient 
of a function of d variables by generalizing the above algorithm. This requires only a single 
query to the oracle to do the phase kickback. In contrast, classically estimating the gradient 
of an oracular function requires at least d + 1 queries. This is described in [107]. 
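
The one-dimensional version of this algorithm can be simulated classically on a small grid. The sketch below makes several simplifying assumptions that are not from the text: the grid has N points, the phase is scaled as exp(2πi f(x)/N) so that an integer slope lands exactly on a basis state, and the second Fourier transform is taken with the inverse sign convention. The function name `estimate_slope` is illustrative. Note that the classical simulation evaluates f at every grid point, whereas the quantum algorithm applies the oracle once to a superposition.

```python
import cmath

def estimate_slope(f, n_points):
    """Uniform superposition -> phase exp(2*pi*i*f(x)/N) -> inverse DFT ->
    return the most probable measurement outcome."""
    N = n_points
    state = [cmath.exp(2j * cmath.pi * f(x) / N) / N ** 0.5 for x in range(N)]
    probs = []
    for k in range(N):
        amp = sum(state[x] * cmath.exp(-2j * cmath.pi * k * x / N)
                  for x in range(N)) / N ** 0.5
        probs.append(abs(amp) ** 2)
    return max(range(N), key=lambda k: probs[k])

print(estimate_slope(lambda x: 5 * x + 3, 16))   # constant offset only adds a global phase
```

The constant term of f contributes only a global phase, so the outcome depends on the slope alone, in analogy with the spot position depending only on the prism angle.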

A literature exists on analog optical computation. It might be interesting to investigate 
whether some existing optical algorithms have more powerful quantum analogues. Also, 
it seems that quantum computers would be naturally suited to the simulation of classical 
optics problems. Perhaps a quantum speedup could be obtained for optical problems of 
practical interest, such as ray tracing. 

Appendix C 

Phase Estimation 



Phase estimation for unitary operators was introduced in [115], and is explained nicely in 
[137]. Here I will describe this method, and discuss how it can be used to measure in the 
eigenbasis of physical observables. 

Suppose you have a quantum circuit of poly(n) gates on n qubits which implements 
the unitary transformation U. Suppose also you are given some eigenstate $|\psi_j\rangle$ of U. You 
can always efficiently estimate the corresponding eigenvalue to polynomial precision using 
phase estimation. The first step in constructing the phase estimation circuit is to construct 
a circuit for controlled-U. 

Given a polynomial size quantum circuit for U, one can always construct a polynomial 
size quantum circuit for controlled-U, which is a unitary $\tilde{U}$ on n + 1 qubits defined by 

$$\tilde{U} |0\rangle |\psi\rangle = |0\rangle |\psi\rangle, \qquad \tilde{U} |1\rangle |\psi\rangle = |1\rangle U |\psi\rangle$$

for any n-qubit state $|\psi\rangle$. One can do this by taking the quantum circuit for U and replacing 
each gate with the corresponding controlled gate. Each gate in the original circuit acts on 
at most k qubits, where k is some constant independent of n. Then, the corresponding 
controlled gate acts on k + 1 qubits. By general gate universality results, any unitary on a 
constant number of qubits can be efficiently implemented. 

The phase estimation algorithm uses a series of controlled-$U^{2^m}$ operators for successive 
values of m. Given a quantum circuit for U, one can construct a quantum circuit for $U^k$ 
by concatenating k copies of the original circuit. The resulting circuit for $U^k$ can then be 
converted into a circuit for controlled-$U^k$ as described above. 

Now consider the following circuit, which performs phase estimation to three bits of 
precision. 

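[Circuit diagram: three control qubits, each initialized to $|0\rangle$ and acted on by a Hadamard gate H, control the operations $U^4$, $U^2$, and $U$ applied to the target register.] 
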

The top three qubits collectively form the control register. The bottom n qubits are initialized 
to the eigenstate $|\psi_j\rangle$ and form the target register. The initial array of Hadamard 
gates puts the control register into the uniform superposition. Thus the state of the n + 3 
qubits after this step is 

$$\frac{1}{\sqrt{8}} \sum_{x=0}^{7} |x\rangle |\psi_j\rangle.$$

Here we are using place value to make a correspondence between the strings of 3 bits and 
the integers from 0 to 7. The controlled-$U^{2^m}$ circuits then transform the state to 

$$\frac{1}{\sqrt{8}} \sum_{x=0}^{7} |x\rangle U^x |\psi_j\rangle.$$

$|\psi_j\rangle$ is an eigenstate of U, thus this is equal to 

$$\frac{1}{\sqrt{8}} \sum_{x=0}^{7} e^{i x \theta_j} |x\rangle |\psi_j\rangle,$$

where $e^{i\theta_j}$ is the eigenvalue of $|\psi_j\rangle$. Notice that the control register is not entangled with 
the target register. Performing an inverse Fourier transform on the control register thus 
yields 

$$\left| \left\lfloor \frac{8\theta_j}{2\pi} \right\rceil \right\rangle |\psi_j\rangle,$$

where the notation $\lfloor \cdot \rceil$ indicates rounding to the nearest integer. Thus we obtain $\theta_j$ with 
three bits of precision. 

Similarly, with b qubits in the control register, one can obtain $\theta_j$ to b bits of precision. 
The necessary Hadamard and Fourier transforms require only poly(b) gates to perform. 
However, in order to obtain b bits of precision, one must have a circuit for controlled-$U^{2^b}$. 
The only known completely general way to construct a circuit for $U^{2^b}$ from a circuit for 
U is to concatenate $2^b$ copies of the circuit for U. Thus, the total size of the circuit for 
phase estimation will be on the order of $2^b g + \mathrm{poly}(b)$, where g is the number of gates in 
the circuit for U. Thus, $\theta_j$ is obtained by the phase estimation algorithm to within $\pm 2^{-b}$, 
and the dominant contribution to the total runtime is proportional to $2^b$. In other words, 
1/poly(n) precision can be obtained in poly(n) time. 
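
The behavior of the control register can be simulated directly. The following sketch tracks only the control register, since the target register stays in the eigenstate throughout, and it assumes the eigenvalue is written as exp(iθ) and that the final transform is the inverse Fourier transform with kernel exp(-2πi kx/2^b). The function name `phase_estimation_distribution` is illustrative.

```python
import cmath

def phase_estimation_distribution(theta, b):
    """Outcome probabilities of b-bit phase estimation for eigenvalue exp(i*theta)."""
    N = 2 ** b
    # After the Hadamards and controlled powers of U: amplitudes exp(i*x*theta)/sqrt(N).
    control = [cmath.exp(1j * x * theta) / N ** 0.5 for x in range(N)]
    probs = []
    for k in range(N):
        # Inverse Fourier transform on the control register.
        amp = sum(control[x] * cmath.exp(-2j * cmath.pi * k * x / N)
                  for x in range(N)) / N ** 0.5
        probs.append(abs(amp) ** 2)
    return probs

exact = phase_estimation_distribution(2 * cmath.pi * 5 / 8, 3)
inexact = phase_estimation_distribution(2 * cmath.pi * 0.3, 3)
print(max(range(8), key=lambda k: exact[k]), round(max(exact), 6))
print(max(range(8), key=lambda k: inexact[k]), round(max(inexact), 3))
```

When θ/2π is an exact multiple of 2^{-b} the outcome is deterministic; otherwise the distribution is peaked at the nearest integer to 2^b θ/2π with some weight on neighboring outcomes.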

For some special U, it is possible to construct a circuit for $U^{2^b}$ using poly(b) gates. For 
example, this is true for the operation of modular exponentiation. The quantum algorithm 
for factoring can be formulated in terms of phase estimation of the modular exponentiation 
operator, as described in chapter five of [137]. 

The phase estimation algorithm can be thought of as a special type of measurement. 
Suppose you are given a superposition of eigenstates of U. By linearity, the phase estimation 
circuit will perform the following transformation 

$$|00\ldots\rangle \sum_j a_j |\psi_j\rangle \rightarrow \sum_j a_j \left| \left\lfloor \frac{2^b \theta_j}{2\pi} \right\rceil \right\rangle |\psi_j\rangle.$$

By measuring the control register in the computational basis, one obtains the result $\lfloor 2^b \theta_j / 2\pi \rceil$ 
with probability $|a_j|^2$. If the eigenvalues of U are separated by at least $2^{-b}$, then one has 
thus performed a measurement in the eigenbasis of U. 

Often, one is interested in measuring in the eigenbasis of some observable defined by a 
Hermitian operator. For example, one may wish to measure in the eigenbasis of a Hamiltonian, 
or an angular momentum operator. This can be done by combining the techniques of 
phase estimation and quantum simulation. As discussed in section 1.3, it is generally believed 
that any physically realistic observable can be efficiently simulated using a quantum 
circuit. More precisely, for any observable H one can construct a circuit for $U = e^{iHt}$, where 
the number of gates is polynomial in t. Using this U in the phase estimation algorithm, one 
can measure eigenvalues of H to polynomial precision. 

Appendix D 

Minimizing Quadratic Forms 



A quadratic form is a function of the form $f(x) = x^T M x + b \cdot x + c$, where M is a d × d 
matrix, x and b are d-dimensional vectors, and c is a scalar. Here we consider M, x, b, and 
c to be real. Without loss of generality we may assume that M is symmetric. If M is also 
positive definite, then / has a unique minimum. In addition to its intrinsic interest, the 
problem of finding this minimum by making queries to a blackbox for / can serve as an 
idealized mathematical model for numerical optimization problems. 

Andrew Yao proved in 1975 that $\Omega(d^2)$ classical queries are necessary to find the minimum 
of a quadratic form [177]. In 2004, I found that a single quantum query suffices 
to estimate the gradient of a blackbox function, whereas a minimum of d + 1 queries are 
required classically [107]. One natural application for this is gradient based numerical optimization. 
In 2005, David Bulger applied quantum gradient estimation along with Grover 
search to the problem of numerically finding the minimum of an objective function with 
many basin-like local minima [36]. In the same paper he suggested that it would be interesting 
to analyze the speedup obtainable by applying quantum gradient estimation to the 
problem of minimizing a quadratic form. In this appendix, I show that a simple quantum 
algorithm based on gradient estimation can find the minimum of a quadratic form using 
only O(d) queries, thus beating the classical lower bound. 

For the blackbox for / to be implemented on a digital computer, its inputs and outputs 
must be discretized, and represented with some finite number of bits. For present purposes, 
we shall ignore the "numerical noise" introduced by this discretization. In addition, on quan- 
tum computers, blackboxes must be implemented as unitary transformations. The standard 
way to achieve this is to let the unitary transformation $U_f |x\rangle |y\rangle = |x\rangle |(y + f(x)) \bmod N_o\rangle$ 
serve as the blackbox for f. Here $|x\rangle$ is the input register, $|y\rangle$ is the output register, and 
$N_o$ is the allowed range of y. By choosing y = 0, one obtains f(x) in the output register. 
As discussed in appendix B, by choosing 

$$|y\rangle \propto \sum_z e^{-2\pi i z/N_o} |z\rangle,$$

one obtains $U_f |x\rangle |y\rangle = e^{2\pi i f(x)/N_o} |x\rangle |y\rangle$, since, in this case, $|y\rangle$ is an eigenstate of addition 
modulo $N_o$. This is a standard technique in quantum computation, known as phase kickback. 

There are a number of methods by which one can find the minimum of a quadratic form 
using O(d) applications of quantum gradient estimation. Since each gradient estimation 
requires only a single quantum query to the blackbox, such methods only use O(d) queries. 
One such method is as follows. First, we evaluate $\nabla f$ at x = 0. If M is symmetric, 
$\nabla f = 2Mx + b$. Thus, this first gradient evaluation gives us b. Next we evaluate $\nabla f$ at 
$x = (1/2, 0, 0, \ldots)^T$. This yields $2M(1/2, 0, 0, \ldots)^T + b$. After subtracting b we obtain the 
first column of M. Similarly, we then evaluate $\nabla f$ at $(0, 1/2, 0, 0, \ldots)^T$ and subtract b to 
obtain the second column of M, and so on, until we have full knowledge of M. Next we just 
compute $-\frac{1}{2} M^{-1} b$ to find the minimum of f. This process uses a total of d + 1 gradient 
estimations, and hence d + 1 quantum queries to the blackbox for f. Note that M is always 
invertible since by assumption it is symmetric and positive definite. 
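
The procedure above can be sketched classically by treating each gradient estimation as a call to an exact gradient routine. Everything named here is an assumption for illustration: `minimize_quadratic` and `counting_grad` are hypothetical helpers, the gradient is taken to be exact, and exact rational arithmetic with a plain Gauss-Jordan solve is used in place of any particular numerical library.

```python
from fractions import Fraction

def minimize_quadratic(grad, d):
    """Recover b and M from d + 1 gradient evaluations of
    f(x) = x^T M x + b.x + c, then solve M x = -b/2 for the minimum."""
    zero = [Fraction(0)] * d
    b = grad(zero)                                   # gradient at 0 is b
    cols = []
    for j in range(d):
        e = list(zero)
        e[j] = Fraction(1, 2)
        g = grad(e)                                  # equals (column j of M) + b
        cols.append([g[i] - b[i] for i in range(d)])
    M = [[cols[j][i] for j in range(d)] for i in range(d)]
    aug = [M[i] + [-b[i] / 2] for i in range(d)]     # augmented system M x = -b/2
    for c in range(d):                               # Gauss-Jordan elimination
        pivot = next(r for r in range(c, d) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(d):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[i][d] for i in range(d)]

# Hypothetical example: M = [[2, 1], [1, 3]], b = (-4, 6).
M_true, b_true, calls = [[2, 1], [1, 3]], [-4, 6], [0]

def counting_grad(x):
    calls[0] += 1                                    # count gradient evaluations
    return [sum(2 * M_true[i][j] * x[j] for j in range(2)) + b_true[i]
            for i in range(2)]

x_min = minimize_quadratic(counting_grad, 2)
print(x_min, calls[0])
```

The counter confirms that d + 1 gradient evaluations are used; in the quantum setting each of these corresponds to a single blackbox query.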
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Appendix E 

Principle of Deferred Measurement 



The principle of deferred measurement is a simple but conceptually important fact which
follows directly from the postulates of quantum mechanics [137]. It says:

Any measurement performed in the course of a quantum computation can always be deferred 
until the end of the computation. If any operations are conditionally performed depending 
on the measurement outcome they can be replaced by coherent conditional operations. 



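For example, consider a process on two qubits, where one qubit is measured in the
computational basis, and if it is found to be in the state $|1\rangle$ then the second qubit is
flipped.

[Circuit diagram: the first qubit is measured, and the classical outcome controls an X gate
on the second qubit.]

By the principle of deferred measurement, this is exactly equivalent to

[Circuit diagram: a CNOT gate controlled by the first qubit and targeting the second qubit,
followed by measurement of the first qubit.]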
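The equivalence in this two-qubit example can be verified numerically with density matrices. The following is a minimal sketch, assuming the measurement outcome is not post-selected, so each process is described by summing over both outcomes; all function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)        # projector onto |0>
P1 = np.diag([0, 1]).astype(complex)        # projector onto |1>
CNOT = np.kron(P0, I2) + np.kron(P1, X)     # first qubit controls, second is target

def random_density_matrix(rng, dim=4):
    """Random pure two-qubit state, returned as a density matrix."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

def measure_then_classical_control(rho):
    """Measure qubit 0; if the outcome is 1, flip qubit 1."""
    out = np.zeros_like(rho)
    for P, U in ((P0, I2), (P1, X)):
        K = np.kron(P, U)                   # outcome projector times conditional gate
        out += K @ rho @ K.conj().T
    return out

def coherent_control_then_measure(rho):
    """Apply CNOT coherently, then measure qubit 0 at the end."""
    rho = CNOT @ rho @ CNOT.conj().T
    out = np.zeros_like(rho)
    for P in (P0, P1):
        K = np.kron(P, I2)
        out += K @ rho @ K.conj().T
    return out

rho = random_density_matrix(np.random.default_rng(0))
print(np.allclose(measure_then_classical_control(rho),
                  coherent_control_then_measure(rho)))
```

The two processes agree because $(P_m \otimes 1)\,\mathrm{CNOT} = P_m \otimes X^m$ for each outcome $m$, so both share the same Kraus operators.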

The principle of deferred measurement allows us to immediately see that the
measurement-based model of quantum computation (as described in section 1.6.5) can be
efficiently simulated by quantum circuits. In addition, the principle of deferred measurement
implies that uniform families of quantum circuits generated in polynomial time by quantum
computers are no more powerful than uniform families of quantum circuits generated in
polynomial time by classical computers, as shown below.

A quantum circuit can be described by a series of bits corresponding to possible quantum
gates. If a bit is 1 then the corresponding gate is present in the quantum circuit. Otherwise
it is absent. Any quantum circuit on n qubits with poly(n) gates chosen from a finite set can
be described by poly(n) classical bits in this way. We can think of the quantum circuit
as a series of classically controlled gates, one for each bit in the description. Now suppose
these bits are generated by a quantum computer, which I'll call the control circuit. By
the principle of deferred measurement, these classically controlled gates can be replaced by
quantum controlled gates. We now consider the control circuit and these controlled gates
together as one big quantum circuit. It is still of polynomial size, and it is controlled by
the classical computer that generated the control circuit. Thus it is in BQP.
Appendix F

Adiabatic Theorem 



F.1 Main Proof

This appendix gives a proof of the adiabatic theorem due to Jeffrey Goldstone [81].



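Theorem 5. Let $H(s)$ be a finite-dimensional twice differentiable Hamiltonian on $0 \le s \le 1$
with a nondegenerate ground state $|\phi_0(s)\rangle$ separated by an energy gap $\gamma(s)$. Let
$|\psi(t)\rangle$ be the state obtained by Schrödinger time evolution with Hamiltonian $H(t/T)$
starting with state $|\phi_0(0)\rangle$ at $t = 0$. Then, with appropriate choice of phase for
$|\phi_0(t)\rangle$,

\[ \bigl\| |\psi(T)\rangle - |\phi_0(T)\rangle \bigr\| \le \frac{1}{T} \left[ \frac{1}{\gamma(0)^2} \left\| \frac{dH}{ds} \right\|_{s=0} + \frac{1}{\gamma(1)^2} \left\| \frac{dH}{ds} \right\|_{s=1} + \int_0^1 ds \left( \frac{5}{\gamma^3} \left\| \frac{dH}{ds} \right\|^2 + \frac{1}{\gamma^2} \left\| \frac{d^2H}{ds^2} \right\| \right) \right]. \]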

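Proof. Let $E_0(t)$ be the ground state energy of $H(t)$. Let $H_f(t) = H(t) - E_0(t)\mathbb{1}$.
It is easy to see that if $|\psi(t)\rangle$ is the state obtained by time evolving an initial state
$|\psi(0)\rangle$ according to $H(t)$ then

\[ |\psi_f(t)\rangle = e^{i \int_0^t E_0(\tau)\, d\tau} |\psi(t)\rangle \]

is the state obtained by time evolving the initial state $|\psi(0)\rangle$ according to $H_f(t)$.
Since these states differ by only a global phase, the problem of proving the adiabatic theorem
reduces to the case where $E_0(t) = 0$ for $0 \le t \le T$. We assume this from now on. Thus

\[ H(s) |\phi_0(s)\rangle = 0, \]

so

\[ \frac{d}{ds} |\phi_0\rangle = -G \frac{dH}{ds} |\phi_0\rangle, \tag{F.1} \]

where

\[ G = \sum_{j \ne 0} \frac{|\phi_j\rangle \langle\phi_j|}{E_j}, \tag{F.2} \]

and $|\phi_0(s)\rangle, |\phi_1(s)\rangle, |\phi_2(s)\rangle, \ldots$ is an eigenbasis for $H(s)$ with
corresponding energies $E_0(s) = 0, E_1(s), E_2(s), \ldots$.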
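The step leading to equation F.1 can be filled in as follows. This is a sketch which assumes that the "appropriate choice of phase" in Theorem 5 is the one satisfying $\langle\phi_0| \frac{d}{ds} |\phi_0\rangle = 0$ (an assumption, not stated explicitly above). It also uses $GH = \mathbb{1} - |\phi_0\rangle\langle\phi_0|$, which follows from the definition of $G$ in equation F.2.

```latex
\begin{align*}
0 &= \frac{d}{ds}\bigl( H |\phi_0\rangle \bigr)
   = \frac{dH}{ds} |\phi_0\rangle + H \frac{d}{ds} |\phi_0\rangle, \\
G \frac{dH}{ds} |\phi_0\rangle
  &= -G H \frac{d}{ds} |\phi_0\rangle
   = -\bigl( \mathbb{1} - |\phi_0\rangle\langle\phi_0| \bigr) \frac{d}{ds} |\phi_0\rangle
   = -\frac{d}{ds} |\phi_0\rangle .
\end{align*}
```

The last equality is where the phase convention enters: normalization of $|\phi_0\rangle$ only forces $\langle\phi_0| \frac{d}{ds} |\phi_0\rangle$ to be purely imaginary, and an $s$-dependent phase can then be chosen to make it vanish.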

We start with the Schrödinger equation

\[ i \frac{d}{dt} |\psi\rangle = H(t/T) |\psi\rangle, \]

and rescale the time coordinate to obtain

\[ \frac{i}{T} \frac{d}{ds} |\psi\rangle = H(s) |\psi\rangle. \]

The corresponding unitary evolution operator $U_T(s, s')$ is the solution of

\[ \frac{i}{T} \frac{d}{ds} U_T(s, s') = H(s) U_T(s, s'), \qquad U_T(s, s) = \mathbb{1}, \]

and also satisfies

\[ -\frac{i}{T} \frac{d}{ds'} U_T(s, s') = U_T(s, s') H(s'). \]



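We wish to bound the norm of

\[ \delta_T = |\phi_0(1)\rangle - U_T(1, 0) |\phi_0(0)\rangle = \int_0^1 \frac{d}{ds} \bigl[ U_T(1, s) |\phi_0(s)\rangle \bigr]\, ds. \]

Integration by parts yields

\[ \delta_T = \frac{i}{T} \left[ U_T(1, s)\, G^2 \frac{dH}{ds} |\phi_0\rangle \right]_{s=0}^{s=1} - \frac{i}{T} \int_0^1 ds\, U_T(1, s) \frac{d}{ds} \left( G^2 \frac{dH}{ds} |\phi_0\rangle \right). \]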



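Thus, by the triangle inequality

\[ \| \delta_T \| \le \frac{1}{T} \left[ \left\| G^2 \frac{dH}{ds} |\phi_0\rangle \right\|_{s=0} + \left\| G^2 \frac{dH}{ds} |\phi_0\rangle \right\|_{s=1} + \int_0^1 ds \left\| \frac{d}{ds} \left( G^2 \frac{dH}{ds} |\phi_0\rangle \right) \right\| \right]. \tag{F.3} \]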



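A straightforward calculation gives

\[ \frac{d}{ds} \left( G^2 \frac{dH}{ds} |\phi_0\rangle \right) = |\phi_0\rangle \langle\phi_0| \frac{dH}{ds} G^3 \frac{dH}{ds} |\phi_0\rangle - G \frac{dH}{ds} G^2 \frac{dH}{ds} |\phi_0\rangle + G^3 \frac{dH}{ds} |\phi_0\rangle \langle\phi_0| \frac{dH}{ds} |\phi_0\rangle + G^2 \frac{d^2H}{ds^2} |\phi_0\rangle - 2 G^2 \frac{dH}{ds} G \frac{dH}{ds} |\phi_0\rangle \]

as shown in section F.2. Thus, by the triangle inequality and submultiplicativity,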





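\[ \left\| \frac{d}{ds} \left( G^2 \frac{dH}{ds} |\phi_0\rangle \right) \right\| \le 5 \|G\|^3 \left\| \frac{dH}{ds} \right\|^2 + \|G\|^2 \left\| \frac{d^2H}{ds^2} \right\|. \tag{F.4} \]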



Substituting equation F.4 into equation F.3 and noting that $\|G\| = 1/\gamma$ completes the
proof. □
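As a numerical sanity check, the bound of Theorem 5 can be compared with a simulated adiabatic evolution. The following is a rough sketch under several assumptions that are not part of the proof: a hypothetical single-qubit interpolation $H(s) = -(1-s)X - sZ$ (so $d^2H/ds^2 = 0$), a piecewise-constant midpoint approximation of the time evolution, and a distance that is minimized over the global phase of the final ground state.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def H(s):
    """Hypothetical example Hamiltonian interpolating from -X to -Z."""
    return -(1 - s) * X - s * Z

dH = X - Z   # dH/ds is constant for this linear interpolation

def gap(s):
    w = np.linalg.eigvalsh(H(s))
    return w[1] - w[0]

def ground_state(s):
    return np.linalg.eigh(H(s))[1][:, 0]

def adiabatic_error(T, steps):
    """Norm of the difference between the evolved state and the final ground state."""
    dt = T / steps
    psi = ground_state(0.0)
    for k in range(steps):
        w, V = np.linalg.eigh(H((k + 0.5) / steps))            # midpoint Hamiltonian
        psi = V @ (np.exp(-1j * w * dt) * (V.conj().T @ psi))  # exp(-i H dt) psi
    overlap = abs(np.vdot(ground_state(1.0), psi))
    return np.sqrt(max(0.0, 2 - 2 * overlap))                  # phase-optimized distance

def theorem_bound(T, grid=2001):
    """Right-hand side of Theorem 5 for this example (second-derivative term is zero)."""
    s = np.linspace(0, 1, grid)
    norm_dH = np.linalg.norm(dH, 2)                            # operator norm
    g = np.array([gap(x) for x in s])
    integrand = 5 * norm_dH**2 / g**3
    integral = np.sum((integrand[1:] + integrand[:-1]) / 2) * (s[1] - s[0])
    return (norm_dH / g[0]**2 + norm_dH / g[-1]**2 + integral) / T

T = 50.0
print(adiabatic_error(T, 5000), "<=", theorem_bound(T))
```

The simulated error should fall below the bound, since the theorem is a worst-case estimate that is not expected to be tight for any particular Hamiltonian.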



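F.2 Supplementary Calculation

In this section we calculate

\[ \frac{d}{ds} \left( G^2 \frac{dH}{ds} |\phi_0\rangle \right). \]

$G$ as described in equation F.2 is not convenient to work with. Roughly speaking, $G$
represents the "operator" $\frac{Q}{H}$, where $Q$ is the projector

\[ Q = \mathbb{1} - |\phi_0\rangle \langle\phi_0|. \]

However, $H$ has a zero eigenvalue and is therefore not invertible. We can define

\[ \tilde{H} = H + \epsilon |\phi_0\rangle \langle\phi_0|, \]

where $\epsilon$ is some arbitrary nonzero real constant. $\tilde{H}$ is invertible and furthermore,

\[ G = Q \tilde{H}^{-1} = \tilde{H}^{-1} Q = Q \tilde{H}^{-1} Q. \]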

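Thus,

\[ \frac{dG}{ds} = \frac{dQ}{ds} \tilde{H}^{-1} Q + Q \frac{d\tilde{H}^{-1}}{ds} Q + Q \tilde{H}^{-1} \frac{dQ}{ds} = \frac{dQ}{ds} G + Q \frac{d\tilde{H}^{-1}}{ds} Q + G \frac{dQ}{ds}. \]

For any invertible operator $M$,

\[ \frac{dM^{-1}}{ds} = -M^{-1} \frac{dM}{ds} M^{-1}. \]

Thus,

\[ \frac{dG}{ds} = \frac{dQ}{ds} G - Q \tilde{H}^{-1} \frac{d\tilde{H}}{ds} \tilde{H}^{-1} Q + G \frac{dQ}{ds} = \frac{dQ}{ds} G - G \frac{d\tilde{H}}{ds} G + G \frac{dQ}{ds}. \]

$\epsilon$ is an $s$-independent constant, so
$\frac{d\tilde{H}}{ds} = \frac{dH}{ds} + \epsilon \frac{d}{ds} \bigl( |\phi_0\rangle \langle\phi_0| \bigr)$.
The $\epsilon$ term vanishes between the two factors of $G$ because $G |\phi_0\rangle = 0$, thus

\[ \frac{dG}{ds} = \frac{dQ}{ds} G - G \frac{dH}{ds} G + G \frac{dQ}{ds}. \tag{F.5} \]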



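Using

\[ \frac{dQ}{ds} = -\frac{d |\phi_0\rangle}{ds} \langle\phi_0| - |\phi_0\rangle \frac{d \langle\phi_0|}{ds} \]

and equation F.1 yields

\[ \frac{dQ}{ds} = G \frac{dH}{ds} |\phi_0\rangle \langle\phi_0| + |\phi_0\rangle \langle\phi_0| \frac{dH}{ds} G. \]

Substituting this into equation F.5 yields

\[ \frac{dG}{ds} = G^2 \frac{dH}{ds} |\phi_0\rangle \langle\phi_0| + |\phi_0\rangle \langle\phi_0| \frac{dH}{ds} G^2 - G \frac{dH}{ds} G. \tag{F.6} \]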
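Equation F.6 can be checked against a finite-difference derivative. The following is a minimal sketch, assuming a hypothetical family $H(s) = D + sK$ with a well-separated spectrum. Since the derivation assumes zero ground energy for all $s$, the code uses the shifted derivative $dH/ds - (dE_0/ds)\mathbb{1}$, with $dE_0/ds = \langle\phi_0| \frac{dH}{ds} |\phi_0\rangle$.

```python
import numpy as np

def ground_data(H):
    """Return the ground-state projector P and G = sum_{j != 0} |j><j| / (E_j - E_0)."""
    w, V = np.linalg.eigh(H)
    P = np.outer(V[:, 0], V[:, 0].conj())
    G = sum(np.outer(V[:, j], V[:, j].conj()) / (w[j] - w[0])
            for j in range(1, len(w)))
    return P, G

rng = np.random.default_rng(1)
n = 4
D = np.diag([0.0, 1.0, 2.0, 3.0]).astype(complex)
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
K = (A + A.conj().T) / 2
K *= 0.3 / np.linalg.norm(K, 2)       # small perturbation keeps the gap open

def H(s):
    return D + s * K                  # hypothetical example family, dH/ds = K

s, h = 0.3, 1e-5
P, G = ground_data(H(s))
dE0 = np.real(np.trace(P @ K))        # derivative of the ground energy
dHf = K - dE0 * np.eye(n)             # derivative of the shifted Hamiltonian

rhs = G @ G @ dHf @ P + P @ dHf @ G @ G - G @ dHf @ G          # equation F.6
lhs = (ground_data(H(s + h))[1] - ground_data(H(s - h))[1]) / (2 * h)
print(np.allclose(lhs, rhs, atol=1e-6))
```

$G$ is unaffected by the energy shift because it depends only on the differences $E_j - E_0$; only the derivative of $H$ has to be adjusted.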
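With this expression for $\frac{dG}{ds}$ we can now easily calculate
$\frac{d}{ds} \left( G^2 \frac{dH}{ds} |\phi_0\rangle \right)$. Specifically,

\[ \frac{d}{ds} \left( G^2 \frac{dH}{ds} |\phi_0\rangle \right) = \frac{dG}{ds} G \frac{dH}{ds} |\phi_0\rangle + G \frac{dG}{ds} \frac{dH}{ds} |\phi_0\rangle + G^2 \frac{d^2H}{ds^2} |\phi_0\rangle + G^2 \frac{dH}{ds} \frac{d |\phi_0\rangle}{ds} \]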